CHAPTER ONE
INTRODUCTION

Development is the central phase in the software
design life cycle; it absorbs at least 75 percent of the
cost of a piece of new software [Pressman 1982].
Decisions made in this phase ultimately affect the
success of the implementation and maintenance of the
software. Despite this importance, software development
is very difficult to manage. Preset schedules and
completion dates for a software system can seldom be
kept, and the quality of the system more often than not
becomes suspect as its size grows. These difficulties can
be attributed to the limited amount of historical data
available to guide a software manager in controlling the
progress of a development project. The ability to identify
and evaluate the historical data of a program during its
development phase is therefore urgently needed; it enables
the manager of a software development team not only to
monitor the quality of the program but also to regulate
the development cost and schedule.

Traditionally, the quality of software development
has been monitored by means of complexity measures, which
try to measure human factors that affect software
development. Two classical complexity measures are
McCabe's complexity measure [McCabe 1976] and Halstead's
metrics [Halstead 1977]. While both measures are
sophisticated and mathematically sound, neither provides
a vehicle for a quick estimation of the progress of
software development.

Dunsmore and Gannon [Dunsmore 1977] took a very
different view of estimating software complexity. They
proposed as a measure of complexity the number of
"program changes" that must be made from the initial
version of a program until it reaches its final form. The
same concept was later employed in analyzing the style of
C programs [Berry 1985] and in evaluating software
development [Weiss 1985]. Recently, Lanchbury
[Lanchbury 1986] proposed a model to evaluate the progress
of a program during its development cycle. The model is
empirically oriented; it derives software code change
patterns from a successful project and helps a software
manager monitor the change pattern of a program.

The purpose of this work is to propose a change
classification and a set of intuitive rules for effective
evaluation of program change patterns during software
development. The work is essentially an extension of
Lanchbury's work. It is important that a software manager
see and interpret the change patterns during software
development; the intuitive rules are designed to
facilitate the interpretation of those changes.

In addition to this chapter, this thesis contains four
chapters. Chapter 2 describes the nature of the program
change data selected for analysis and the procedures used
to collect these data. Chapter 3 presents qualitatively the
process of classifying the program change data. This is
followed by a quantitative discussion of the
classification in Chapter 4, which also leads to the
proposal of a set of intuitive rules for program
progress analysis using the pattern classification.
Concluding remarks and recommendations for future work are
given in the last chapter, Chapter 5.
CHAPTER TWO
DATA COLLECTION

The changes in a program that occur during the
software development stage can be analyzed on the basis of
several types of data pertinent to the program. Measures
for determining the progress of the software development
are extracted from the results of the analysis. A software
manager can use these measures to evaluate the progress of
a program during its development stage.

This chapter discusses the collection of data. The
nature of the sample programs under examination is
discussed in the first section. The second section
describes the utility software employed in this work. In
the third section, the program CHANGES is presented in
detail. CHANGES takes a pair of programs as input and
yields the file main.results as output; the output file
contains data about the differences between the two input
programs. The organization of these data is discussed in
the last section.

2.1  SELECTION  OF  PROGRAMS  TO  BE  ANALYZED 

All the programs analyzed in the present work were
written in the programming language C in the Unix
environment. The programs were written by undergraduate
students in a fundamental software engineering class
(course number CMPSC541, one of the core courses in the
undergraduate curriculum in the Department of Computing
and Information Sciences at Kansas State University).
Software design methodologies were taught in the class,
and students were asked to design a program based on those
methodologies. Successive versions of the same program
were saved during the course of development; they served
as the sample programs.

Two different versions of the same program are selected
as a pair for analysis; more than sixty pairs of programs
have been analyzed in this study. Programs are paired
according to their size and coding date. Intuitively, the
two programs with the smallest differences in size and
coding date are successive versions of a program; they
are grouped as a pair.
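
The pairing heuristic can be sketched as follows. The sketch is written in Python rather than in the C-shell and awk used in this work, and it assumes that every saved version is described by a hypothetical (name, coding date, size in lines) record; neither the record layout nor the function name comes from the original tools. Versions are ordered by coding date, with size as a tie-breaker, and neighbouring versions are grouped as pairs.

```python
from datetime import date


def pair_successive_versions(versions):
    """Pair neighbouring versions after ordering them by coding date and size.

    `versions` is a list of (name, coding_date, size_in_lines) tuples; this
    record layout is an assumption made for illustration only.
    """
    ordered = sorted(versions, key=lambda version: (version[1], version[2]))
    # Each version is paired with the one that immediately follows it.
    return [(earlier[0], later[0]) for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical versions of one program, listed out of order.
versions = [
    ("prog.v3", date(1987, 10, 20), 120),
    ("prog.v1", date(1987, 10, 5), 100),
    ("prog.v2", date(1987, 10, 12), 104),
]
print(pair_successive_versions(versions))
```

Ordering by date first is only one reading of "smallest differences in size and coding date"; a distance measure that combines both quantities would be an equally plausible choice.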

2.2  UTILITY  PROGRAMS  FOR  DATA  COLLECTION 

Three utility programs, namely COUNT, NESTING, and
TYPEPGM, were designed as C-shell programs in Unix; each
of these programs is a complex awk program. Awk is a Unix
data manipulation tool [Bourne 1987]; it is a pattern
matching language and report generator. The three
programs are described individually in the following
subsections.

2.2.1 COUNT (usage: awk -f COUNT programfile)

COUNT counts the indentation level of each line of
a program. The input program must be in a file,
designated here by programfile, and must be in
pretty-print format; this can be achieved by preprocessing
programfile with the Unix command cb. The result of
COUNT is stored in a file, countfile. As an example,
Appendix A is a sample C program, named SampleProgram1,
whose indentation levels are obtained by the utility
program COUNT and reproduced in Table 2.1. The source
code of COUNT is given in Appendix C.
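
The idea behind COUNT can be illustrated with a minimal Python sketch. It is not the awk program of Appendix C; it assumes that the pretty-printed input uses exactly one leading tab per nesting level, which is an assumption about the output of cb, and the function name is hypothetical.

```python
def indentation_levels(source):
    """Return the indentation level of every line in `source`.

    Assumption: the text is pretty-printed with one leading tab per
    nesting level, so the level equals the number of leading tabs.
    """
    levels = []
    for line in source.splitlines():
        stripped = line.lstrip("\t")
        levels.append(len(line) - len(stripped))
    return levels


sample = "main()\n{\n\tint i;\n\tif (i) {\n\t\ti = 0;\n\t}\n}\n"
print(indentation_levels(sample))
```

A blank line is reported at level zero in this sketch; how COUNT itself treats blank lines is not stated in the text.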

2.2.2 NESTING (usage: awk -f NESTING countfile)

NESTING takes the output file of COUNT, designated by
countfile, and yields the statistics of the indentation
levels of a program. The statistics are stored in an
output file; they include the total number and the
percentage of lines at each indentation level. The latter
is calculated by

    percentage of level N indentation
        = (total number of level N indentation * 100)
          / total number of lines of code.

Moreover, the average indentation level of the countfile
is defined as

    average indentation level
        = (zero + one*2 + two*3 + three*4 + four*5
           + five*6 + six*7) * 100
          / total number of lines of code,

where zero, one, ..., represent the total numbers of lines
at indentation level zero, one, ..., respectively. Table 2.2
illustrates the indentation statistics of SampleProgram1,
in which ZEROAVE denotes the percentage of level zero
indentation, ONEAVE denotes that of level one indentation,
etc. The source code of NESTING is given in Appendix D.
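
The two formulas can be checked with a short Python sketch. The function name and the list representation of the level counts are assumptions made for illustration; the counts themselves are those reported for SampleProgram1 in Table 2.2.

```python
def nesting_statistics(level_counts, lines_of_code):
    """Return (percentages, average_level) for per-level line counts.

    level_counts[n] is the number of lines at indentation level n.
    Level n is weighted by n + 1, as in the formula for the average
    indentation level.
    """
    percentages = [count * 100 / lines_of_code for count in level_counts]
    weighted_sum = sum(count * (level + 1) for level, count in enumerate(level_counts))
    average_level = weighted_sum * 100 / lines_of_code
    return percentages, average_level


# Level counts ZERO ... SIX of SampleProgram1 (104 lines of code).
counts = [41, 26, 18, 16, 3, 0, 0]
percentages, average_level = nesting_statistics(counts, 104)
print([round(p, 3) for p in percentages])
print(round(average_level, 3))
```

Rounded to three decimals, the sketch gives 39.423 for level zero and 217.308 for the average, which agree with ZEROAVE and TOTAL AVE in Table 2.2.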

2.2.3 TYPEPGM (usage: awk -f TYPEPGM programfile)

TYPEPGM calculates the total number of occurrences of
each statement type in the program in the file
programfile. Nineteen (19) statement types have been
defined. "for ...", "while ...", and "if ..." are
examples of different statement types; a complete list of
statement types is shown in Table 2.3. The program can
be either in pretty-print format or in any other free-style
format. Each line in the program is analyzed and its type
is recorded; the total number of statements in the program
is also recorded. The weight of the program is calculated
according to the following formula [Gustafson 1985]:

    weight = 18.4 * count["declaration"]
           + 11.4 * count["if"]
           +  7.9 * count["for"]
           +  8.5 * count["while"]
           +  6.8 * count["switch"]
           +  5.6 * count["case"]
           +  4.6 * count["preprocessor"]
           + 11.1 * count["goto"]
           +  2.4 * count["comment"]

in which the weighting of each statement type is defined
based on the frequency of change of that statement type.
Note that only 9 of the 19 statement types appear in the
formula above. This is based on the previous research
result that the remaining 10 statement types changed with
negligible frequency compared to those listed. Subsequent
research on the maintenance phase [An 1987] has shown that
the program weight is correlated with changes during
maintenance. The average weight of an input program is
also obtained in TYPEPGM; it is calculated by
[Gustafson 1985]

    average weight = weight / total number of lines of code.

Table 2.3 gives the result of processing SampleProgram1
using TYPEPGM. The source code of TYPEPGM is given in
Appendix E.
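
As a worked example of the weight formula, the following Python sketch recomputes the weight of SampleProgram1 from the statement counts listed in Table 2.3. The dictionary layout and the function name are assumptions made for illustration; only the nine coefficients are taken from the formula above.

```python
# Coefficients of the nine statement types that enter the weight formula.
WEIGHTS = {
    "declaration": 18.4,
    "if": 11.4,
    "for": 7.9,
    "while": 8.5,
    "switch": 6.8,
    "case": 5.6,
    "preprocessor": 4.6,
    "goto": 11.1,
    "comment": 2.4,
}


def program_weight(counts):
    """Weighted sum of statement counts; unweighted types are ignored."""
    return sum(weight * counts.get(kind, 0) for kind, weight in WEIGHTS.items())


# Non-zero statement counts of SampleProgram1 (Table 2.3).
sample_counts = {
    "for": 1, "while": 3, "if": 6, "else": 5, "assignment": 25,
    "preprocessor": 6, "comment": 15, "blankline": 15, "output": 10,
    "function": 3, "declaration": 5,
}
weight = program_weight(sample_counts)
average_weight = weight / 104  # 104 lines of code
print(round(weight, 1), round(average_weight, 3))
```

The result agrees with the WEIGHT and WEIGHT/LINES entries of Table 2.3 (257.4 and 2.475) up to floating-point rounding.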

2.3  MAIN  PROGRAM  FOR  DATA  COLLECTION 

A C-shell program, CHANGES, was constructed to combine
the utility programs described in the preceding sections
with Unix data manipulation tools and C-shell commands in
collecting data for program change analysis. The data
manipulation tools in use are

    diff    find the differences between two files,
    grep    match patterns in a set of files, and
    sed     edit a stream.

The C-shell commands employed include

    echo    echo a message, and
    cb      beautify a program into an appropriate
            indentation format.

More details on the Unix data manipulation tools and
C-shell commands can be found elsewhere (see, e.g.,
[Bourne 1987]).

CHANGES takes two programs as inputs and generates an
output file containing information about the individual
input files and about the differences between them. The
command

    CHANGES program1 program2

invokes the execution of the program CHANGES. Note that,
for the best results, program1 and program2 should be
chosen according to the criteria discussed in section 2.1.
Figure 2.1 depicts a data flow diagram of CHANGES, whose
complete listing is given in Appendix F. The processes
found in the data flow diagram are

    Calculating Occurrence of Statement Types,
    Pretty-Printing a Program,
    Summing the Indentation Level,
    Finding the Differences, and
    Extracting the Changed Statements.

The function of each process is explained in the
following subsections.

Figure 2.1  Data-flow diagram for the program CHANGES.
[Diagram not reproduced; recoverable labels: statements of version 1, statements of version 2, number of indentation, main.results.]

2.3.1 Calculating Occurrence of Statement Types

This process counts the total number of occurrences
of the various types of statements in an input program by
using the utility program TYPEPGM. The sed command is
used inside the process to perform a global stream edit
before TYPEPGM is executed. To be exact,

    sed 's/"/ " /g
         s/}/} /g
         s/{/{ /g'

globally replaces each double quote with a double quote
surrounded by spaces, each "}" with "} ", and each "{"
with "{ ". This ensures that the key word of each
statement type is not obscured by leading or trailing
symbols. For example, the statement

    {if (NF == 0) {if ...

is transformed into

    { if (NF == 0) { if ...

in which the key word "if" can be read clearly by TYPEPGM.
In this process, the two input programs are processed
independently; the results are saved in the file
main.results.
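
The effect of this preprocessing can be imitated in a few lines of Python. This is a sketch rather than the sed script used by CHANGES; it assumes, in line with the "{ if" example, that every brace is followed by a space and that every double quote is surrounded by spaces. The function name is hypothetical.

```python
def separate_symbols(line):
    """Detach quotes and braces from neighbouring key words.

    Assumed substitutions: '"' -> ' " ', '}' -> '} ', '{' -> '{ '.
    """
    line = line.replace('"', ' " ')
    line = line.replace("}", "} ")
    line = line.replace("{", "{ ")
    return line


print(separate_symbols("{if (NF == 0) {if (x) y = 1;}else z = 2;"))
```

After the substitution, a simple whitespace split yields "if" and "else" as separate tokens, which is what a keyword-matching program such as TYPEPGM needs.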
2.3.2 Pretty-Printing a Program

This process pretty-prints an input program by using
the Unix C-shell command cb. Specifically, it employs
the command

    cb < old-file > new-file.

Again, the two input programs are processed individually.
The output of this process, new-file, is ready to be used
as an input to the utility program COUNT.

2.3.3 Summing the Indentation Level

This process uses two awk programs described in
section 2.2, namely, COUNT and NESTING. By executing

    awk -f COUNT input-program | awk -f NESTING,

the process counts the indentation level of each line of
code and yields the statistics of the indentation levels
of the input program. The results of this process are
also saved in the file main.results.

2.3.4 Finding the Differences

This process finds the differences between two
versions of the same program by using the Unix data
manipulation tools diff and grep. The former finds the
differences between a pair of input files. For example,

    diff -e programfile1 programfile2

lists the lines that must be changed in programfile1 to
bring it into agreement with programfile2. The option "-e"
causes the results to be recorded as a script of a, c, and
d commands, where a means "statement added", c means
"statement changed", and d means "statement deleted";
these commands, when used in the Unix editor ed, will
recreate programfile2 from programfile1. By piping the
results of diff to grep, or more specifically,

    diff -e programfile1 programfile2 | grep '^[0-9]'

the differences between programfile1 and programfile2 are
captured in a set of change indexes, each of which is a
line in the format

    line#[a|c]

or

    linestart#,lineend#d

where the former implies that some code has been added
after line number line# or that the specified line has
been changed in programfile1, and the latter implies that
lines linestart# to lineend# have been deleted from
programfile1. The results of this step are not saved;
instead, they are piped directly to the following process.
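
The change indexes can also be picked out of an ed script programmatically. The following Python sketch plays the role of the grep filter; the regular expression, the function name, and the tuple layout are assumptions made for illustration, and the sample script is hypothetical.

```python
import re

# One change index: a line number, an optional end line, and a command letter.
CHANGE_INDEX = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")


def parse_change_indexes(script_lines):
    """Return (start, end, command) tuples for the command lines of an ed script."""
    indexes = []
    for line in script_lines:
        match = CHANGE_INDEX.match(line.strip())
        if match:
            start = int(match.group(1))
            end = int(match.group(2)) if match.group(2) else start
            indexes.append((start, end, match.group(3)))
    return indexes


# Hypothetical script in the style produced by "diff -e".
script = ["12a", "x = 1;", ".", "3,5d", "1c", "int y;", "."]
print(parse_change_indexes(script))
```

Like the grep filter, this sketch looks only at the shape of each line, so a line of program text that happens to consist of digits followed by a, c, or d would be mistaken for a change index.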

2.3.5 Extracting the Changed Statements

This process has two inputs. One is a program, e.g.,
programfile1; the other is the set of change indexes of
the program. To facilitate the extraction of changed
statements from the input program based on the change
indexes, the indexes obtained via the process "Finding
the Differences" need to be pre-processed by the sed
command followed by an awk program. This awk program
represents each change index as a line expression of the
format

    NR == NR# {print "[a|b|c|d]", $0; i=1}

where NR# is the line number of the statement that has
been modified (added, blank, corrected, or deleted). The
change indexes in their line expression format are
temporarily stored in the file result. By executing

    awk -f result programfile1,

all statements that have been modified are collected,
each prefixed by the appropriate label, a, c, or d. To
complete the extraction of changed statements, the
prefixed statements are piped to another awk program,
which singles out the statements prefixed with c and
stores them in a file, temp. Temp is in turn processed by
the sed command, and the result is stored in the file
final, which is the output of the process.
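
The labelling and filtering steps can be sketched in Python as follows. The sketch assumes that the change indexes are already available as hypothetical (start, end, command) tuples and that the program is held as a list of lines; it does not reproduce the generated awk file result or the final sed pass.

```python
def label_changed_lines(program_lines, change_indexes):
    """Prefix each line named by a change index with its command letter."""
    labelled = []
    for start, end, command in change_indexes:
        for number in range(start, end + 1):
            # Line numbers in the change indexes are 1-based.
            if 1 <= number <= len(program_lines):
                labelled.append((command, program_lines[number - 1]))
    return labelled


def changed_statements(labelled):
    """Keep only the statements labelled c (changed)."""
    return [text for command, text in labelled if command == "c"]


# Hypothetical program and change indexes.
program = ["int x;", "x = 0;", "y = 1;", "return x;"]
indexes = [(4, 4, "a"), (2, 3, "d"), (1, 1, "c")]
labelled = label_changed_lines(program, indexes)
print(labelled)
print(changed_statements(labelled))
```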

2.4   DISCUSSION 

Sixty-four pairs of programs have been analyzed in
the present research; each pair belongs to one of twelve
different programs designed by eight teams. After a pair
of programs is processed by CHANGES as described in
section 2.3, the results are appended to the file
main.results. Table 2.4 is a sample listing of
main.results which contains the results of analyzing one
pair of programs. (The source codes of the two programs
are given in Appendices A and B, respectively.) Notice
that the listing is divided into four blocks separated by
double broken lines. The first block contains statistics
for the first program of the pair, and the second block
contains statistics for the second program. The third
block contains statistics for the changes found in the
program pair when both programs are pre-processed by the
command cb. The fourth block contains statistics for the
changes found in the program pair when neither program is
pre-processed by cb.

The elapsed time for processing a pair of programs
depends on the average size of the pair: the larger the
average size, the longer the elapsed time. Table 2.5
summarizes the elapsed times for processing the different
pairs of programs and the average size of each pair.

The data obtained in the VAX Unix environment were
transferred to an Apple Macintosh personal computer
for further analysis. Two awk programs, PICK and SEP,
were employed to facilitate the transfer. The former
strips off all descriptive parts of the data and retains
only the file names, the count of each statement type, and
the count of each indentation level. The latter separates
data pertaining to different pairs of programs into
independent files. Appendices G and H are the source
codes for PICK and SEP, respectively.

Excel [Townsend 1985], a spreadsheet software package,
was chosen on the Macintosh as the tool for organizing
data for program change analysis. Table 2.6 gives an
example of data represented in Excel. The table can be
viewed as containing two blocks of data. The first block,
comprising columns A, B, and C, is a direct
representation of the data transferred from the VAX
environment. The second block, comprising columns D, E,
and F, is the same as the first block except for WEIGHT,
LOC (LINES OF CODE), TOTAL AVE, SUM, and ZERO through SIX.
The values of WEIGHT, LOC, SUM, and TOTAL AVE are
normalized; the formula to normalize WEIGHT and TOTAL AVE
is

    normalized WEIGHT or TOTAL AVE
        = (WEIGHT or TOTAL AVE) / 100.

As an example, the value in cell 21E is calculated by

    257.40002 / 100 = 2.57.

The formula to normalize LOC and SUM is

    normalized LOC or SUM
        = (LOC or SUM) / 10.

For example, the value in cell 22E is obtained by

    104 / 10 = 10.4.

Recall that ZERO, ONE, ... represent the total counts of
indentation level zero, one, ..., respectively; these
counts are normalized in the second block by the formula

    normalized count of indentation level N
        = (total count of indentation level N
           / lines of code) * 10.

As an example, the value in cell 24E is calculated by

    (41 / 104) * 10 = 3.94.

The data contained in the second block have been further
expressed in bar-chart format; the results are
illustrated in Figures 2.2 to 2.4. All data collected by
the means described in this chapter are analyzed and
discussed in detail in the next chapter.
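
The three normalizations amount to simple rescalings, as the following Python sketch shows. The function names are hypothetical; the input values are those of SampleProgram1.

```python
def normalize_by_hundred(value):
    """Normalization applied to WEIGHT and TOTAL AVE."""
    return value / 100


def normalize_by_ten(value):
    """Normalization applied to LOC and SUM."""
    return value / 10


def normalize_level_count(count, lines_of_code):
    """Normalization applied to the indentation-level counts ZERO ... SIX."""
    return count / lines_of_code * 10


print(round(normalize_by_hundred(257.40002), 2))   # WEIGHT
print(normalize_by_ten(104))                       # LOC
print(round(normalize_level_count(41, 104), 2))    # ZERO
```

The scaling brings weights, line counts, and level counts into a comparable range of roughly 0 to 10, which is presumably what allows them to share one axis in the bar charts.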
Figure 2.2  Program differences represented in a bar chart in terms of various statement types
and lines of code.

Figure 2.3  Program differences represented in a bar chart in terms of normalized indentation
level.

Figure 2.4  Program differences represented in a bar chart in terms of various statement types
and lines of code.
[Chart not reproduced. Legend: programs in pretty-printed format; programs not in pretty-printed format. Horizontal axis: CONTINUE, ASSIGNMENT, PREPROCESSOR, COMMENT, BLANKLINE, RETURN, INPUT, OUTPUT, FUNCTION, DECLARATION, DEFAULT, WEIGHT.]
Table 2.1  Results of COUNT program for SampleProgram1.

    statement    statement    statement
      1 - 35      36 - 70     71 - 104

        0            0            1
        0            1            1
        0            0            2
        0            1            2
        0            0            3
        0            1            3
        0            0            2
        0            1            2
        0            2            2
        0            0            3
        0            1            3
        0            1            2
        0            0            1
        0            1            0
        0            1            1
        0            2            2
        0            2            1
        0            3            2
        0            3            3
        0            3            2
        0            3            3
        0            3            4
        0            3            3
        0            2            4
        0            2            1
        0            2            1
        0            3            1
        0            3            2
        0            4            2
        0            0            1
        0            3            1
        1            2            1
        1            1            1
        1            0            0
        1            1
Table 2.2  Results of NESTING program for SampleProgram1.

    Levels:
    ZERO             41
    ONE              26
    TWO              18
    THREE            16
    FOUR              3
    FIVE              0
    SIX               0

    ZEROAVE  =       39.423
    ONEAVE  =        25.000
    TWOAVE  =        17.308
    THREEAVE  =      15.385
    FOURAVE  =        2.885
    FIVEAVE  =        0.000
    SIXAVE  =         0.000
    TOTAL AVE =     217.308
    SUM  =          104
    LINES OF CODE = 104
    SUM/LINES =       1.000
Table 2.3  Results of TYPEPGM program for SampleProgram1.

    FOR               1
    WHILE             3
    IF                6
    ELSE              5
    SWITCH            0
    CASE              0
    GOTO              0
    BREAK             0
    CONTINUE          0
    ASSIGNMENT       25
    PREPROCESSOR      6
    COMMENT          15
    BLANKLINE        15
    RETURN            0
    INPUT             0
    OUTPUT           10
    FUNCTION          3
    DECLARATION       5
    DEFAULT           0

    WEIGHT  =        257.40002
    LINES OF CODE =  104
    WEIGHT/LINES  =    2.475
Table 2.4  Listing of main.results for SampleProgram1 and SampleProgram2.

    %*%  This is start of analysis data
    Sun Nov  8 16:03:56 CST 1987

    File Name :  SampleProgram1

    FOR               1
    WHILE             3
    IF                6
    ELSE              5
    SWITCH            0
    CASE              0
    GOTO              0
    BREAK             0
    CONTINUE          0
    ASSIGNMENT       25
    PREPROCESSOR      6
    COMMENT          15
    BLANKLINE        15
    RETURN            0
    INPUT             0
    OUTPUT           10
    FUNCTION          3
    DECLARATION       5
    DEFAULT           0
    WEIGHT  =        257.40002
    LINES OF CODE =  104
    WEIGHT/LINES  =    2.475

    Levels :
    ZERO             41
    ONE              26
    TWO              18
    THREE            16
    FOUR              3
    FIVE              0
    SIX               0
    ZEROAVE  =       39.423
    ONEAVE  =        25.000
    TWOAVE  =        17.308
    THREEAVE  =      15.385
    FOURAVE  =        2.885
    FIVEAVE  =        0.000
    SIXAVE  =         0.000
    TOTAL AVE =     217.308
    SUM  =          104
    LINES OF CODE = 104
    SUM/LINES =       1.000
    ==============================
    File Name :  SampleProgram2

    FOR               2
    WHILE             3
    IF                8
    ELSE              7
    SWITCH            0
    CASE              0
    GOTO              0
    BREAK             0
    CONTINUE          0
    ASSIGNMENT       28
    PREPROCESSOR      6
    COMMENT          39
    BLANKLINE        22
    RETURN            0
    INPUT             0
    OUTPUT           10
    FUNCTION          7
    DECLARATION       5
    DEFAULT           0
    WEIGHT  =        345.70001
    LINES OF CODE =  120
    WEIGHT/LINES  =    2.88083

    Levels :
    ZERO             49
    ONE              27
    TWO              19
    THREE            13
    FOUR             12
    FIVE              1
    SIX               0
    ZEROAVE  =       40.496
    ONEAVE  =        22.314
    TWOAVE  =        15.702
    THREEAVE  =      10.744
    FOURAVE  =        9.917
    FIVEAVE  =        0.826
    SIXAVE  =         0.000
    TOTAL AVE =     229.752
    SUM  =          121
    LINES OF CODE = 121
    SUM/LINES =       1.000
    ==============================
    File Name :  changes.with.CB

    FOR               1
    WHILE             3
    IF                1
    ELSE              1
    SWITCH            0
    CASE              0
    GOTO              0
    BREAK             0
    CONTINUE          0
    ASSIGNMENT       13
    PREPROCESSOR      0
    COMMENT           5
    BLANKLINE         1
    RETURN            0
    INPUT             0
    OUTPUT            4
    FUNCTION          2
    DECLARATION       1
    DEFAULT           0
    WEIGHT  =         75.20000
    LINES OF CODE =   37
    WEIGHT/LINES  =    2.03243

    Levels :
    ZERO              7
    ONE              16
    TWO               9
    THREE             5
    FOUR              0
    FIVE              0
    SIX               0
    ZEROAVE  =       18.919
    ONEAVE  =        43.243
    TWOAVE  =        24.324
    THREEAVE  =      13.514
    FOURAVE  =        0.000
    FIVEAVE  =        0.000
    SIXAVE  =         0.000
    TOTAL AVE =     232.432
    SUM  =           37
    LINES OF CODE =  37
    SUM/LINES =       1.000
    ==============================
    File Name :  changes.without.CB

    FOR               1
    WHILE             3
    IF                2
    ELSE              1
    SWITCH            0
    CASE              0
    GOTO              0
    BREAK             0
    CONTINUE          0
    ASSIGNMENT       14
    PREPROCESSOR      0
    COMMENT           5
    BLANKLINE         0
    RETURN            0
    INPUT             0
    OUTPUT            3
    FUNCTION          2
    DECLARATION       1
    DEFAULT           0
    WEIGHT  =         86.60001
    LINES OF CODE =   39
    WEIGHT/LINES  =    2.22051

    Sun Nov  8 16:05:10 CST 1987
Table 2.5  Summary of the elapsed times for executing the 64 pairs of programs and the
average sizes of respective pairs of programs.

    SET OF PROGRAM:  1A, 1B, 1C, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

    elapsed time:  7:35, 6:59, 7:39, 0:43, 1:10, 7:40, 0:48, 6:43, 3:31, 3:39, 1:38, 1:20, 1:17, 0:53
    size:          435, 423, 431, 27, 56, 439, 63, 769, 105, 82, 195, 42, 134, 52

    elapsed time:  6:43, 6:28, 9:53, 0:49, 0:55, 9:26, 0:46, 21:27, 2:15, 3:42, 3:58, 1:00, 1:30, 1:15
    size:          433, 435, 452, 59, 56, 446, 68, 1305, 107, 88, 309, 51, 133, 61

    elapsed time:  8:42, 8:14, 6:13, 0:42, 1:40, 4:18, 0:48, 18:46, 5:43, 2:47, 7:15, 1:42, 1:16, 1:23
    size:          429, 448, 434, 58, 67, 446, 70, 1795, 110, 106, 412, 82, 140, 87

    elapsed time:  8:26, 0:41, 2:06, 7:28, 0:50, 11:27, 3:40, 15:17, 1:42, 1:37, 2:06
    size:          446, 36, 67, 464, 66, 2216, 116, 436, 80, 163, 112

    elapsed time:  8:59, 34:34, 1:31, 3:05, 0:13, 1:34
    size:          449, 2413, 118, 424, 58, 125

    elapsed time:  19:01, 4:52, 1:14, 1:22
    size:          1456, 448, 51, 90

    elapsed time:  2:54
    size:          472
Table 2.6  Sample data represented in Excel; the data reveal the difference between a pair of
programs.

         A               B         C         D               E        F
     1                   P1        P2
     2   FOR             1         2         FOR             1        2
     3   WHILE           3         3         WHILE           3        3
     4   IF              6         8         IF              6        8
     5   ELSE            5         7         ELSE            5        7
     6   SWITCH          0         0         SWITCH          0        0
     7   CASE            0         0         CASE            0        0
     8   GOTO            0         0         GOTO            0        0
     9   BREAK           0         0         BREAK           0        0
    10   CONTINUE        0         0         CONTINUE        0        0
    11   ASSIGNMENT      25        28        ASSIGNMENT      25       28
    12   PREPROCESSOR    6         6         PREPROCESSOR    0        6
    13   COMMENT         15        39        COMMENT         15       39
    14   BLANKLINE       15        22        BLANKLINE       15       22
    15   RETURN          0         0         RETURN          0        0
    16   INPUT           0         0         INPUT           0        0
    17   OUTPUT          10        10        OUTPUT          10       10
    18   FUNCTION        3         7         FUNCTION        3        7
    19   DECLARATION     5         5         DECLARATION     5        5
    20   DEFAULT         0         0         DEFAULT         0        0
    21   WEIGHT          257.4     345.70    WEIGHT          2.57     3.46
    22   LOC             104       120       LOC             10.4     12
    23   WEIGHT/LOC      2.88      2.48      WEIGHT/LOC      2.48     2.88
    24   ZERO            41        49        ZERO            3.94     4.08
    25   ONE             26        27        ONE             2.50     2.25
    26   TWO             18        19        TWO             1.73     1.58
    27   THREE           16        13        THREE           1.54     1.08
    28   FOUR            3         12        FOUR            0.29     0.08
    29   FIVE            0         1         FIVE            0        0
    30   SIX             0         0         SIX             0        0
    31   ZEROAVE         39.42     40.50     TOTAL AVE       2.17     2.30
    32   ONEAVE          25.00     22.31     SUM             10.4     12.1
    33   TWOAVE          17.31     15.70     LOC             10.4     12.1
    34   THREEAVE        15.39     10.74     SUM/LOC         1        1
    35   FOURAVE         2.89      9.92
    36   FIVEAVE         0         0.83
    37   SIXAVE          0         0
    38   TOTAL AVE       217.31    229.75
    39   SUM             104       121
    40   LOC             104       121
    41   SUM/LOC         1         1
    42                   WITH      WITHOUT
    43                   CB        CB
    44   FOR             1         1         FOR             1        1
    45   WHILE           3         3         WHILE           3        3
    46   IF              1         2         IF              1        2
    47   ELSE            1         0         ELSE            1        1
    48   SWITCH          0         0         SWITCH          0        0
    49   CASE            0         0         CASE            0        0
    50   GOTO            0         0         GOTO            0        0
    51   BREAK           0         0         BREAK           0        0
    52   CONTINUE        0         0         CONTINUE        0        0
    53   ASSIGNMENT      13        14        ASSIGNMENT      13       14
    54   PREPROCESSOR    0         0         PREPROCESSOR    0        0
    55   COMMENT         5         5         COMMENT         5        5
    56   BLANKLINE       1         0         BLANKLINE       1        0
    57   RETURN          0         0         RETURN          0        0
    58   INPUT           0         0         INPUT           0        0
    59   OUTPUT          4         3         OUTPUT          4        3
    60   FUNCTION        2         2         FUNCTION        2        2
    61   DECLARATION     1         1         DECLARATION     1        1
    62   DEFAULT         0         0         DEFAULT         0        0
    63   WEIGHT          75.2      86.6      WEIGHT          0.75     0.87
    64   LOC             37        39        LOC             3.7      3.9
    65   WEIGHT/LOC      2.03      2.22      WEIGHT/LOC      2.03     2.22
    66   ZERO            7                   ZERO            1.89
    67   ONE             16                  ONE             4.32
    68   TWO             9                   TWO             2.43
    69   THREE           5                   THREE           1.35
    70   FOUR            0                   FOUR            0
    71   FIVE            0                   FIVE            0
    72   SIX             0                   SIX             0
    73   ZEROAVE         18.92
    74   ONEAVE          43.24
    75   TWOAVE          24.32
    76   THREEAVE        13.51
    77   FOURAVE         0
    78   FIVEAVE         0
    79   SIXAVE          0
    80   TOTAL AVE       232.43
    81   SUM             37
    82   LOC             37
    83   SUM/LOC         1
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CHAPTER  THREE 
CLASSIFICATION  OF  PROGRAM  CHANGES 

The classification of program changes can help a software
manager assess the progress of a program. Such a
classification can be determined once the data describing
the changes made during program development have been read
and the patterns of change identified. The data collection
was discussed in the previous chapter and the intuitive
rules will be elaborated in the next chapter; this chapter
studies the process of classifying the program change data.

The  first  section  of  this  chapter  presents  a 
preliminary  analysis  of  the  program  change  data  collected. 
This  is  followed  by  a  detailed  analysis  in  the  second 
section.  Classification  of  program  change  patterns  based 
on  the  results  of  the  analysis  is  proposed  in  the  third 
section.  The  advantages  of  the  proposed  classification 
types  are  discussed  in  the  last  section. 

3.1   PRELIMINARY  ANALYSIS 

Four kinds of statistics were examined in the preliminary
analysis:

(1) the difference in the counts of each statement type
between a pair of non-pretty-printed programs,

(2) the difference in the counts of each indentation level
between a pair of pretty-printed programs,

(3) the total number of statements of each type that have
been modified between a pair of pretty-printed programs,
and

(4) the number of statements of each type that have been
modified between a pair of non-pretty-printed programs.

These statistics provide data for a qualitative estimate of
the progress of a software development project.

As an example, the counts of each statement type for a
non-pretty-printed program pair and those of each
indentation level for a pretty-printed program pair are
reproduced in Table 3.1. From the Table, we observe that,
on one hand, the counts of two (out of nineteen) statement
types, namely DECLARATION and OUTPUT, decrease from the
first version to the second version. On the other hand,
the counts of six statement types, namely FOR, IF,
ASSIGNMENT, COMMENT, BLANKLINE, and FUNCTION, increase from
the first version to the second version. The counts of the
remaining eleven statement types are the same in the two
versions. These statistics reflect a poor design or
incomplete design specifications; they indicate a lack of
progress, so development is still continuing
[Lanchbury1986].
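
The tally above can be reproduced mechanically. The following Python sketch uses the statement-type counts of Table 3.1 (rows 2 to 20); the function name `split_by_trend` is illustrative only and is not part of the tools described in this work.

```python
# Statement-type counts from Table 3.1, rows 2-20: (version 1, version 2).
COUNTS = {
    "FOR": (5, 8), "WHILE": (0, 0), "IF": (4, 8), "ELSE": (1, 1),
    "SWITCH": (0, 0), "CASE": (0, 0), "GOTO": (0, 0), "BREAK": (0, 0),
    "CONTINUE": (0, 0), "ASSIGNMENT": (5, 6), "PREPROCESSOR": (6, 6),
    "COMMENT": (31, 49), "BLANKLINE": (45, 59), "RETURN": (0, 0),
    "INPUT": (0, 0), "OUTPUT": (4, 0), "FUNCTION": (14, 20),
    "DECLARATION": (4, 3), "DEFAULT": (0, 0),
}


def split_by_trend(counts):
    """Group statement types by whether their count fell, rose, or held."""
    trend = {"decrease": [], "increase": [], "same": []}
    for name, (v1, v2) in counts.items():
        if v2 < v1:
            trend["decrease"].append(name)
        elif v2 > v1:
            trend["increase"].append(name)
        else:
            trend["same"].append(name)
    return trend


trend = split_by_trend(COUNTS)
for key, names in trend.items():
    print(key, len(names), sorted(names))
```

The three groups contain two, six and eleven statement types respectively, which matches the counts quoted in the text.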

Also in Table 3.1, the counts for indentation levels ZERO,
ONE, TWO, and THREE in the second version are less than
those in the first; the counts of the remaining
indentation levels in the second version are greater than
those in the first. These changes in the counts of the
various indentation levels indicate that structural
changes have been made to the program.

Table 3.2 presents a set of sample data for preliminary
analyses (3) and (4). Data in column B were obtained after
both versions of a program had been pretty-printed, i.e.,
converted to a consistent indentation. Data in column C
were obtained from two versions of a non-pretty-printed
program. Comparing column B with column C, we see that the
counts in the latter are never smaller, and are often
greater, than those in the former. Such a pattern typifies
a change to a piece of software effected by
pretty-printing.

3.2   DETAILED  ANALYSIS 

While the preliminary analyses are straightforward and
enable a quick estimate of the progress of software
development, they do not provide insight into that
progress. Nonetheless, it is worth mentioning that data
collected in the form of Tables 3.1 and 3.2 actually
contain more information than is revealed in the
preliminary analyses. A detailed analysis of the data
suggests that a variety of activities can be emphasized
during the development of a program. For example, some
development may emphasize enhancement, which means that
statements of different types are added to a program;
other development may emphasize deletion, which means that
statements of different types are removed from a program.
In this work, ten classes of program change patterns have
been summarized to embody most activities occurring in
software development. This classification of program
change patterns is based on a detailed analysis of data
belonging to sixty-four pairs of programs; the patterns
include:

(1)  Debugging, 

(2)  Documentation, 

(3)  Correction, 

(4)  Pretty-printing, 

(5)  Reconstruction, 

(6)  Removing  documentation, 

(7)  Removing  functionality, 

(8)  Adding  functionality, 

(9) Removing debugging, and

(10) Redistribution.

The definition of each of the ten patterns can be found in
Table 3.3. Notice that more than one pattern may occur
when a program changes from one version to the next.

3.3   DISCUSSION ON THE CLASSIFICATION OF PROGRAM CHANGE PATTERNS

Ten types of program change pattern have been identified.
The characteristics of each type of pattern are described
in the following sub-sections.

3.3.1  Debugging 

When a program is debugged, the count of output statements
increases. The statements are added to monitor the
behavior of the program; they include "putchar", "putc",
"printf", "fprintf", "printw", "write", "puts" and
"fputs". Figure 3.1 gives a sample program change data set
belonging to a program under debugging. In the Figure, we
observe that a number of output statements have been added
to the second version of the program.
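
As a rough illustration of how the OUTPUT count might be obtained, the Python sketch below counts calls to the routines listed above in C source text. It is a simplification under stated assumptions: the regular expression does not skip comments or string literals, and the names `count_output_statements` and `output_increase` are hypothetical rather than taken from the tools used in this work.

```python
import re

# Output routines named in the text; each call counts as one OUTPUT
# statement. Simplification: comments and string literals are not skipped.
OUTPUT_CALL = re.compile(
    r"\b(?:putchar|putc|printf|fprintf|printw|write|puts|fputs)\s*\("
)


def count_output_statements(source: str) -> int:
    """Count calls to the listed output routines in C source text."""
    return len(OUTPUT_CALL.findall(source))


def output_increase(version1: str, version2: str) -> int:
    """Positive result: output statements were added in version 2."""
    return count_output_statements(version2) - count_output_statements(version1)


# Two tiny, made-up versions of a C program.
V1 = "int main(void)\n{\n    int x = 1;\n    return x;\n}\n"
V2 = (
    "int main(void)\n{\n    int x = 1;\n"
    '    printf("x = %d\\n", x);\n'
    '    fputs("done\\n", stderr);\n'
    "    return x;\n}\n"
)

print(output_increase(V1, V2))
```

The word boundary in the pattern keeps "sprintf" from being counted as "printf", since "sprintf" writes to a buffer rather than producing output.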

3.3.2  Documentation 

Documentation refers to adding comments to a program; it
makes the program more understandable. Needless to say,
documentation results in an increase in the count of the
comment statements. For example, eight additional comments
have been added when the program changes from one version
to the next, as depicted in Figure 3.1.

Figure 3.1 Sample program change data set belonging to a
program under debugging, documentation and correction.

3.3.3   Correction 

A program change pattern is categorized as correction when
both of the following conditions hold.

(1) The number of lines of code shows only a minor
difference between two successive versions of the program.

(2) The counts of the control statements, FUNCTION,
DECLARATION and ASSIGNMENT show only minor changes between
two successive versions of the program. The control
statements include FOR, WHILE, IF, ELSE, SWITCH, CASE,
GOTO, BREAK, RETURN and CONTINUE.

The changes could be the addition or deletion of a few
lines of code of various statement types, including
control statements, FUNCTION, DECLARATION and ASSIGNMENT.

A program change pattern of correction can be seen, again,
in Figure 3.1. In the Figure, we observe, among other
things, that one FOR statement has been deleted and one IF
statement added in the second version. Such small changes
indicate that the program is under correction.

3.3.4   Pretty-Printing 

A program is pretty-printed when its code is displayed
with proper spacing and indentation. The purpose of
pretty-printing is to make the structure of a program
explicit.

Consider two versions of a program in which a number of
statements are properly indented in the second version but
not in the first. These statements are identified as
changed between the two versions. Nonetheless, the changes
become non-identifiable when both versions of the program
are converted into their respective pretty-printed forms,
since pretty-printing results in a unique display of the
program. This observation leads us to propose the
following procedure to identify whether or not
pretty-printing has been applied between two versions of a
program.

(1) Find the differences between two successive versions
of the program based on the method outlined in section
2.3.4.

(2) Repeat step (1), but pretty-print both versions of the
program before finding the differences.

(3) Compare the results of steps (1) and (2).

(4) If the results from steps (1) and (2) are identical,
no pretty-printing is involved in developing the program
from version one to version two.

(5) If the results show a general trend of more statements
being different in step (1) than in step (2), we conclude
that attempts were made to pretty-print the program in
version two.

A result of the analysis based on the proposed procedure
is demonstrated in Figure 3.2. In the Figure, the height
of a bar represents the degree of difference between two
successive versions of a program. A bar labelled "FOR"
with a height of three means that three FOR statements are
different between the two versions. The empty bars are
obtained by step (1) and the solid ones by step (2). The
fact that most of the empty bars are higher than the solid
ones indicates that an attempt has been made to put the
second version of the program into a properly indented
format.
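
Steps (3) to (5) can be sketched in Python as follows. The input values are the nonzero rows of Table 3.2 (column C for step (1), column B for step (2)). Reading "a general trend" as "more statement types differ more in step (1) than the reverse" is an assumption made for this sketch, and the function name is hypothetical.

```python
# Number of statements found to differ, per statement type.
# RAW: step (1), versions compared as written (Table 3.2, column C).
# PRETTY: step (2), both versions pretty-printed first (column B).
RAW = {"IF": 3, "ELSE": 1, "ASSIGNMENT": 3, "PREPROCESSOR": 1,
       "COMMENT": 5, "BLANKLINE": 1, "DECLARATION": 1}
PRETTY = {"IF": 2, "ELSE": 0, "ASSIGNMENT": 1, "PREPROCESSOR": 1,
          "COMMENT": 4, "BLANKLINE": 0, "DECLARATION": 1}


def pretty_printing_verdict(raw, pretty):
    """Compare the step (1) and step (2) difference counts."""
    types = set(raw) | set(pretty)
    more_in_raw = sum(raw.get(t, 0) > pretty.get(t, 0) for t in types)
    more_in_pretty = sum(raw.get(t, 0) < pretty.get(t, 0) for t in types)
    if more_in_raw == 0 and more_in_pretty == 0:
        return "no pretty-printing"          # step (4): identical results
    if more_in_raw > more_in_pretty:
        return "pretty-printing attempted"   # step (5): assumed trend test
    return "inconclusive"


print(pretty_printing_verdict(RAW, PRETTY))
```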

3.3.5   Reconstruction 

In the preceding sub-section, we have shown that proper
indentation is capable of fully expressing the "structure"
of a program written in a structured programming language
such as C. When the structure of the program is altered,
i.e., when the program is reconstructed, the alteration
(reconstruction) manifests itself in changes of the
indentation patterns.

Figure 3.2 Sample program change data set belonging to a
program under pretty-printing.

Let us consider the indentation patterns of two successive
versions of a program, assuming that the total number of
lines of code in the second version is greater than that
of the first by N. If the program has been "reconstructed"
from the first to the second version, we should observe
that the change in the count of indentation level i is
n_i (i = 0 to 6) with

(1) the n_i's not showing a general trend of increasing,
and

(2) any of the counts in the second version being
significantly larger or smaller than the corresponding
counts in the first version.

Figure 3.3 depicts an example of reconstruction. In the
Figure, the number of lines of code increases in the
second version. Accordingly, we expect to see a general
trend of increase in the count of each indentation level
in the second version. However, we observe that the counts
of indentation levels 2, 3 and 4 decrease, and the counts
of indentation levels 0, 1, 5 and 6 increase. Further
analysis tells us that the count of indentation level 3
decreases significantly and the count of indentation level
6 increases significantly.

Figure 3.3 Sample program change data set belonging to a
program under reconstruction.

3.3.6  Removing  Documentation 

Removing documentation is the reverse of the process
described in section 3.3.2; it results in a decrease in
the count of the COMMENT statements. Figure 3.4 depicts an
example of removing documentation. In the Figure, we
observe that two COMMENT statements are removed from the
second version of the program.

3.3.7  Removing  Functionality 

Removing functionality is concerned with the deletion of
control statements, FUNCTION, ASSIGNMENT, include-file
(PREPROCESSOR) and DECLARATION statements from a program.
In contrast to correction, where the number of lines of
code changes only slightly, removing functionality is
recognized when the number of lines of code changes
significantly between two successive versions of a
program.

In Figure 3.4, we find that the counts of FUNCTION and
DECLARATION in the first version are greater than those in
the second version. This change pattern typifies the
process of removing functionality.

Figure 3.4 Sample program change data set belonging to a
program under removing documentation and removing
functionality.

3.3.8  Adding Functionality

The reverse process of "removing functionality" is "adding
functionality". This process involves the addition of
control statements, FUNCTION, ASSIGNMENT, include-file
(PREPROCESSOR) and DECLARATION statements. As with
removing functionality, the pattern is recognized when
statements of these types are added and the number of
lines of code changes significantly in the second version
of a program. An example showing the program change
pattern of adding functionality is illustrated in Figure
3.5. In the Figure, we observe that the counts of FUNCTION
and DECLARATION increase in the second version of the
program.

3.3.9  Removing  Debugging 

When software development reaches its final stage, the
statements inserted for the purpose of debugging need to
be removed. A program change pattern reflecting this
process is called "Removing Debugging"; the pattern shows
a decrease in the OUTPUT statements between two successive
versions of the program. Figure 3.5 gives an example of
removing debugging. In this example, ten output statements
present in the first version of the program are removed.

Figure 3.5 Sample program change data set belonging to a
program under adding functionality and removing debugging.


3.3.10   Redistribution 

Redistribution refers to changing include-file
(PREPROCESSOR) and FUNCTION statements. Specifically, a
program is recognized to be redistributed when either of
the following happens.

(1)  One  or  more  include-file  (PREPROCESSOR)  is 
added  in  conjunction  with  one  or  more 
FUNCTION  being  deleted. 

(2)  One  or  more  include-file  (PREPROCESSOR)  is 
deleted  in  conjunction  with  one  or  more 
FUNCTION  being  added. 

Figure  3.6  gives  an  example  of  redistribution.  In  the 
Figure,  we  observe  that  the  count  of  include-file 
(PREPROCESSOR)  decreases  with  simultaneous  increases  in 
the  count  for  FUNCTION.  The  observation  tells  us  that 
redistribution  has  taken  place. 
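
The two conditions translate directly into a sign test on the changes in the PREPROCESSOR and FUNCTION counts. The Python sketch below is a minimal illustration; the function and parameter names are hypothetical.

```python
def is_redistribution(preprocessor_change: int, function_change: int) -> bool:
    """Changes are (count in version 2) - (count in version 1).

    Condition (1): include-files added while functions are deleted.
    Condition (2): include-files deleted while functions are added.
    """
    added_includes_removed_functions = preprocessor_change > 0 and function_change < 0
    removed_includes_added_functions = preprocessor_change < 0 and function_change > 0
    return added_includes_removed_functions or removed_includes_added_functions


# One include-file removed and two functions added: condition (2) holds.
print(is_redistribution(-1, 2))
```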

Figure 3.6 Sample program change data set belonging to a
program under redistribution.

3.4   ADVANTAGES OF PROPOSED CLASSIFICATION

The advantages of this classification are outlined as
follows.

(1) Facilitate the identification of intuitive rules for
program change analysis. It is worth noting that a myriad
of changes can be made when a program progresses from one
version to the next. To extract intuitive rules from the
large amount of change data is almost impossible. The
proposed classification collects relevant data in small
groups. Extracting rules from each group, with its limited
amount of data, is much easier.

(2) Enhance the reliability of program change analysis.
When only the relevant data are collected in groups, the
analysis based on each small group is more "noise-free",
i.e., each step of the analysis will not be influenced by
irrelevant data. Notice that intuitive rules derived by
the noise-free analysis are more reliable.

(3) Render the progress of software development more
assessable. With program change data well organized, a
technical or non-technical software manager may be able to
assess the progress of the software development at a
glance.

Table 3.1 Sample data collected for preliminary analysis
steps (1) and (2). Rows 1 to 23 are obtained from a pair
of non-pretty-printed programs; rows 24-41 are from the
same pair of programs after they are both pretty-printed.

     A               B           C

 1                   Version 1   Version 2
 2   FOR             5           8
 3   WHILE           0           0
 4   IF              4           8
 5   ELSE            1           1
 6   SWITCH          0           0
 7   CASE            0           0
 8   GOTO            0           0
 9   BREAK           0           0
10   CONTINUE        0           0
11   ASSIGNMENT      5           6
12   PREPROCESSOR    6           6
13   COMMENT         31          49
14   BLANKLINE       45          59
15   RETURN          0           0
16   INPUT           0           0
17   OUTPUT          4           0
18   FUNCTION        14          20
19   DECLARATION     4           3
20   DEFAULT         0           0
21   WEIGHT          2.61        3.55
22   LOC             14          18.4
23   WEIGHT/LOC      1.86        1.92
24   ZERO            5.57        5.27
25   ONE             1.5         0.82
26   TWO             0.36        0.27
27   THREE           0.57        0.43
28   FOUR            0.5         0.71
29   FIVE            0.36        0.65
30   SIX             1.14        2.17
38   TOTAL AVE       2.46        3.11
39   SUM             14          19
40   LOC             14          19
41   SUM/LOC         1           1

Table 3.2 Sample data collected for preliminary analysis
steps (3) and (4).

     A               B           C

 1                   WITH CB     WITHOUT CB
 2   FOR             0           0
 3   WHILE           0           0
 4   IF              2           3
 5   ELSE            0           1
 6   SWITCH          0           0
 7   CASE            0           0
 8   GOTO            0           0
 9   BREAK           0           0
10   CONTINUE        0           0
11   ASSIGNMENT      1           3
12   PREPROCESSOR    1           1
13   COMMENT         4           5
14   BLANKLINE       0           1
15   RETURN          0           0
16   INPUT           0           0
17   OUTPUT          0           0
18   FUNCTION        0           0
19   DECLARATION     1           1
20   DEFAULT         0           0
21   WEIGHT          0.55        0.69
22   LOC             1.4         1.9
23   WEIGHT/LOC      3.96        3.64
24   ZERO            2.86
25   ONE             1.43
26   TWO             0.71
27   THREE           2.14
28   FOUR            0.71
29   FIVE            1.43
30   SIX             0.71
31   TOTAL AVE       3.36
32   SUM             1.4
33   LOC             1.4
34   SUM/LOC         1


Table 3.3 Classification of changes and its description.

PATTERN                  DESCRIPTION

Debugging                Output statements are added to monitor
                         the behavior of a program.

Documentation            Comments are added to a program to
                         render it more understandable.

Correction               Errors are corrected in a program.

Pretty-printing          Statements are indented to reflect
                         their level of nesting.

Reconstruction           The count for each indentation level
                         is not consistent with the trend of
                         decrease or increase in the lines of
                         code of a program.

Removing                 Comments are removed from a program.
documentation

Removing                 Function, assignment, declaration and
functionality            preprocessor statements are removed
                         from a program.

Adding                   Function, assignment, declaration and
functionality            preprocessor statements are added to
                         a program.

Removing                 Output statements are removed from a
debugging                program. This is to undo the debugging.

Redistribution           Removing function plus adding
                         preprocessor; adding function plus
                         removing preprocessor.

CHAPTER FOUR
CLASSIFICATION RULES

The changes of a program between two successive versions
are readily seen when the change pattern classification
discussed in Chapter 3 is employed to render the patterns
of change explicit. While the discussion in the previous
chapter is qualitative in nature, this chapter endeavors
to shed light on the quantitative aspect of the
classification by using PROBIT [Finney1971], and proposes
a set of intuitive rules for program progress analysis
using the pattern classification. PROBIT is a statistical
procedure which calculates maximum-likelihood estimates of
the intercept, slope, and natural (threshold) response
rate for biological assay data.

We  commence  this  chapter  by  giving  a  justification  of 
using  PROBIT  as  the  tool  for  quantitative  analysis  of  the 
classification.  This  gives  rise  to  the  identification  of 
certain  threshold  values  crucial  in  quantifying  each  type 
of  classification.  The  set  of  intuitive  rules  guiding  the 
use  of  the  classification  in  change  analysis  of  a  program 
is  then  presented.  An  example  will  be  given  to 
demonstrate  the  applicability  of  the  intuitive  rules. 


4.1  PROBIT 

PROBIT [Finney1971] is a statistical procedure specialized
for dose-response problems in bioassays. For some
stimulus-subject systems, measurement of a response to the
action of the stimulus is impossible or impractical; all
that can be done is to record whether or not the subject
manifests a certain reaction. The quantal response so used
can be death or any other easily recognizable change in
the subject. For example, an insecticide (stimulus) may be
assayed by assigning batches of insects (subjects) to
various doses and then analyzing the relation between
death-rate and dose. Note that each subject can be used
only once. An insect that has died cannot be used again;
even insects that are not dead may have been so affected
by the stimulus that thereafter they react differently
from others not previously exposed to it.

Determining a threshold value for each type of
classification is basically a dose-response problem. The
analogy can be elaborated as follows, using the change
pattern class of debugging as an example:

1. The dose in this case is the number of output
   statements showing up in a change pattern. The goal is
   to determine a number (dose) beyond which the change
   pattern can be classified as debugging.

2. A subject in this case is a change pattern between two
   successive versions of a program. As mentioned earlier,
   64 pairs of programs have been analyzed, i.e., 64
   change patterns (subjects) can be identified. A batch
   of subjects consists of all change patterns with the
   same number of output statements added. This definition
   conforms with the requirement that all subjects in a
   batch receive the same level of dose. For example, all
   change patterns showing an increase of 2 output
   statements will be grouped in one batch and those
   showing an increase of 3 will be grouped in another
   batch. Note that each change pattern (like each insect)
   is unique in its own right and each batch of change
   patterns cannot be reused.

3. A change pattern (subject), after being closely
   reviewed by a software engineer, is judged either to
   respond or not to respond to the dose. The response is
   quantal. A positive response means that the change
   pattern indeed belongs to the class of debugging; a
   negative response means that the change pattern does
   not belong to the class of debugging although some
   output statements were added.

The discussion above explains the use of the PROBIT
procedure for the quantitative analysis of the
classification.
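
The batching described in item 2 can be sketched in Python as below. The observations are made up for illustration (they are not the data of this study), and the function name `to_probit_batches` is hypothetical. Each output triple has the form (dose, subjects, responses).

```python
from collections import defaultdict


def to_probit_batches(observations):
    """Group change patterns into batches that share the same dose.

    observations: iterable of (outputs_added, is_debugging) pairs, where
    is_debugging is the reviewer's yes/no (quantal) judgement.
    Returns a list of (dose, subjects, responses) sorted by dose.
    """
    totals = defaultdict(lambda: [0, 0])  # dose -> [subjects, responses]
    for dose, responded in observations:
        totals[dose][0] += 1
        totals[dose][1] += int(responded)
    return [(dose, n, r) for dose, (n, r) in sorted(totals.items())]


# Hypothetical reviewed change patterns.
OBSERVATIONS = [(2, False), (2, False), (3, True), (9, True), (9, False)]
BATCHES = to_probit_batches(OBSERVATIONS)
print(BATCHES)
```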

4.2  QUANTITATIVE  STUDY  ON  THE  CLASSIFICATION 

A  quantitative  description  for  each  classification 
is  defined  in  this  section.  The  PROBIT  procedure  in  the 
Statistical  Analysis  System  [SAS1982]  is  employed  to  aid 
the  quantitative  analysis. 

4.2.1   Debugging 

Twenty-three of the sixty-four sets of programs in the
current research have been observed to exhibit an increase
in output statements. The increase ranges from 1 to 32; a
summary of the changes is given in Table 4.1. It appears
from the Table that the number of output statements added
for debugging is independent of the size of the program.
Further statistical analysis yields a sample correlation
coefficient of 0.73 between the size of the program and
the number of output statements added for debugging,
implying that the two quantities indeed do not
significantly correlate with each other.
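
For reference, a sample (Pearson) correlation coefficient of the kind quoted above can be computed as in the following Python sketch. The input lists are placeholders chosen so that the result is easy to check; they are not the values of Table 4.1.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Placeholder data: program size (lines) and output statements added.
SIZES = [100, 200, 300, 400]
ADDED = [1, 2, 3, 4]
print(pearson_r(SIZES, ADDED))
```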

To determine a threshold value for this type of change
pattern, the data in Table 4.1 are reorganized according
to the discussion in the preceding section and tabulated
in Table 4.2. Note that in the Table, data are in the
format of DOSE-SUBJECT-RESPONSE; they are ready for PROBIT
analysis. To read the Table, the 6th entry, for example,
means that 2 change patterns have been found to experience
an increase of 9 output statements; only one of them is
found to experience debugging. The SAS program
incorporating the PROBIT procedure, along with the output
for this analysis, is reproduced in Appendix K. The most
important information contained in the output is the table
listing the threshold dose along with the 95% fiducial
limits for different probability levels (see the last
table in Appendix K). The 95% fiducial limits are computed
using a t value of 1.96 since the chi-square is small.
Note that in all the analyses in this study, the
chi-squares are small, indicating that the linearity of
the data is good. However, it should be pointed out that
the width of the 95% fiducial intervals can be very large
as the probability level increases. This is expected since
the human factor in software development can be very
stochastic; it is extremely difficult to interpret a
program change pattern to a high precision. Note that in
some extreme cases, the 95% fiducial limits will be marked
by a period (.) in the SAS output.
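
To show how a threshold dose relates to the fitted parameters, the Python sketch below inverts a probit model of the assumed form p = C + (1 - C) * Phi(a + b * x), where a is the intercept, b the slope, C the natural response rate, Phi the standard normal distribution function, and x the dose (or a transform of it, depending on how the procedure was run). The model form and the parameter values used here are assumptions for illustration; they are not taken from Appendix K.

```python
from statistics import NormalDist


def effective_dose(p, intercept, slope, natural_rate=0.0):
    """Dose x at which the assumed probit model predicts response rate p."""
    if not natural_rate < p < 1.0:
        raise ValueError("p must lie strictly between natural_rate and 1")
    # Remove the natural response rate, then invert the normal CDF.
    adjusted = (p - natural_rate) / (1.0 - natural_rate)
    return (NormalDist().inv_cdf(adjusted) - intercept) / slope


# Hypothetical fitted parameters.
INTERCEPT, SLOPE = -1.0, 0.2
ED50 = effective_dose(0.5, INTERCEPT, SLOPE)
ED90 = effective_dose(0.9, INTERCEPT, SLOPE)
print(ED50, ED90)
```

With a positive slope, the 90 percent threshold is always larger than the 50 percent threshold, which is the ordering seen in the rules of this chapter.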

Based on the result, we can conclude that

if   two successive versions of a program exhibit an
     increase of more than 5 (23) lines of output
     statements,
then there is a 50 (90) percent chance that the program
     has experienced a change in terms of debugging.
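
The rule can be written as a small lookup, sketched here in Python. The thresholds 5 and 23 come from the rule above; the function name and the choice to return `None` below the lower threshold are assumptions of this sketch.

```python
# (minimum increase that must be exceeded, percent chance), highest first.
DEBUGGING_THRESHOLDS = ((23, 90), (5, 50))


def debugging_chance(outputs_added):
    """Return 90, 50, or None for a given increase in output statements."""
    for threshold, percent in DEBUGGING_THRESHOLDS:
        if outputs_added > threshold:
            return percent
    return None  # the rule makes no statement at or below 5 lines


print(debugging_chance(6), debugging_chance(24), debugging_chance(5))
```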

4.2.2   Documentation 

Thirty-eight of the sixty-four sets of programs analyzed
in the current research have been observed to possess an
increase in comment statements. In these 38 sets of
programs, a minimum of 1 and a maximum of 284 comments
were added for the purpose of documentation. Table 4.3
summarizes the change in the number of comment statements
in the 38 sets of programs. Notice that the correlation
coefficient between the size of the program and the number
of comments added for documentation is only 0.44,
indicating that the two quantities do not correlate
strongly. The result of PROBIT analysis of this case is
given in Table 4.4, based on which we conclude that

if   two successive versions of a program exhibit an
     increase of more than 1 (16) lines of COMMENT
     statements,
then there is a 50 (90) percent chance that the program
     has experienced a change in terms of documentation.

4.2.3   Correction 

In  the  current  research,  16  of  the  64  sets  of 
programs  appear  to  possess  this  pattern  of  change,  i.e., 
they  have  gone  through  some  minor  changes  in  FUNCTION, 
DECLARATION,  ASSIGNMENT  and  control  statements.  Table  4.5 
gives  a  breakdown  of  the  activities  involved  in  those  16 
sets  of  programs.  Furthermore,  the  result  of  PROBIT 
analysis  of  this  case  is  shown  in  Table  4.6.  Based  on  the 
result,  we  conclude  that 

if   the number of lines of code differs by fewer than
     10 lines between two successive versions of a
     program,
and  the two versions exhibit more than a total of 1 (4)
     lines of change in FUNCTION, DECLARATION,
     ASSIGNMENT or control statements,
then there is a 50 (90) percent chance that the program
     has experienced, positively or negatively, a change
     in terms of correction.
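
A Python sketch of this two-part rule follows. The limits 10, 1 and 4 are those stated in the rule; the function and parameter names are hypothetical, and `changed_lines` is assumed to be the total number of changed FUNCTION, DECLARATION, ASSIGNMENT and control statements.

```python
def correction_chance(loc_v1, loc_v2, changed_lines):
    """Return 90, 50, or None according to the correction rule."""
    # First condition: the program size must be almost unchanged.
    if abs(loc_v2 - loc_v1) >= 10:
        return None
    # Second condition: enough changed statements of the relevant types.
    if changed_lines > 4:
        return 90
    if changed_lines > 1:
        return 50
    return None


print(correction_chance(120, 123, 2), correction_chance(120, 123, 5))
```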
4.2.4  Pretty-Printing

The change of a program is significant in terms of
pretty-printing if a certain percentage of the changes in
statement types is caused by pretty-printing. The
threshold percentage has been identified by PROBIT, and
the result is presented in Table 4.7. Based on the result,
we conclude that

if   M statement types are changed between two
     successive versions of a program,
and  N statement types are determined to have gone
     through pretty-printing,
and  N/M > 0.1 (0.7),
then there is a 50 (90) percent chance that the program
     has experienced, positively or negatively, a change
     in terms of pretty-printing.

Detailed procedures to obtain M and N can be found in
Section 3.3.4.
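
The ratio test can be sketched in Python as follows. The thresholds 0.1 and 0.7 are those of the rule; the function name, and the decision to return `None` when no statement type has changed (M = 0), are assumptions of this sketch.

```python
def pretty_printing_chance(changed_types, pretty_printed_types):
    """Return 90, 50, or None from M (changed) and N (pretty-printed) types."""
    if changed_types <= 0:
        return None  # nothing changed, so the ratio N/M is undefined
    ratio = pretty_printed_types / changed_types
    if ratio > 0.7:
        return 90
    if ratio > 0.1:
        return 50
    return None


# Example: 7 statement types changed, 5 of them through pretty-printing.
print(pretty_printing_chance(7, 5))
```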

4.2.5  Reconstruction 

The change of a program is significant in terms of
reconstruction if changes are found in more than a certain
percentage of all indentation levels. In the current
research, 25 of the 64 sets of programs appear to have
gone through different degrees of reconstruction. Table
4.8 summarizes the activity of those 25 sets of programs.
In an attempt to determine the threshold value, input
data to PROBIT have been prepared (Table 4.9) based on
information contained in Table 4.8. However, the result
(Table 4.10) fails to yield a meaningful interpretation:
the response decreases as the dose increases. The failure
can be attributed to the small and skewed set of data found
in Table 4.9: the only 3 data sets whose response is
neither 100% nor 0% all have a 50% response. While more
data are required before an analytical threshold value can
be identified, we define subjectively at this juncture that

    if    I is the highest indentation level in the
          second version of a program,
    and   changes are found in J indentation levels,
    and   J/I > 1/2,
    then  the program has gone through, positively or
          negatively, a change in terms of
          reconstruction.

The threshold value of 1/2 in this case has been
determined based on a practitioner's experience and
heuristics.
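
A minimal sketch of the J/I test follows. The function name and
the handling of a program whose highest indentation level is zero
are assumptions made for illustration; exact fractions are used so
that a ratio of exactly 1/2 is not reported as exceeding the
threshold.

```python
from fractions import Fraction


def reconstruction_detected(highest_level, changed_levels):
    """Apply the subjective J/I > 1/2 rule of Section 4.2.5.

    highest_level  -- I, highest indentation level in the second version
    changed_levels -- J, number of indentation levels in which changes occur
    """
    if highest_level <= 0:
        return False  # assumption: no nesting means no reconstruction
    return Fraction(changed_levels, highest_level) > Fraction(1, 2)


print(reconstruction_detected(7, 4))  # True, since 4/7 > 1/2
print(reconstruction_detected(4, 2))  # False, since 2/4 is not above 1/2
```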

4.2.6   Removing  Documentation 

Nine of the sixty-four sets of programs analyzed in
the current research have been observed to have a decrease
in the number of comment statements. Table 4.11 summarizes
the change in the number of comment statements in these 9
sets of programs. Notice in the Table that 4 of the 9 sets
of programs experienced removing documentation with a
decrease in the lines of code; the remaining sets of
programs experienced removing documentation with an
increase in the lines of code. The correlation
coefficient between the size of the program and the number
of comments deleted is -0.097, indicating that the two
quantities are essentially uncorrelated. The result of
PROBIT analysis (Table 4.12) reveals that

    if    two successive versions of a program exhibit
          a decrease of more than 2 (8) lines of
          COMMENT statements,
    then  there is a 50 (90) percent chance that the
          program has experienced a change in terms of
          removing documentation.

4.2.7   Removing  Functionality 

In the current research, 16 of the 64 sets of
programs appear to possess this pattern of changes. Table
4.13 gives a detailed description of the changes in these
16 sets of programs. It is interesting to note from the
Table that FOR and ASSIGNMENT are changed most frequently.
A PROBIT analysis of the data derived from the Table
shows that

    if    two successive versions of a program exhibit
          a decrease of more than a total of 2 (27)
          lines of FUNCTION, DECLARATION, ASSIGNMENT,
          include-file (PREPROCESSOR) or control
          statements,
    then  there is a 50 (90) percent chance that the
          program has experienced a change in terms of
          removing functionality.

The result of PROBIT analysis for this case is summarized
in Table 4.14.

4.2.8   Adding  Functionality 

In the current research, 37 of the 64 sets of
programs appear to possess this pattern of changes. Table
4.15 shows a detailed description of the changes in these 37
sets of programs. A PROBIT analysis of the data derived
from the Table shows that

    if    two successive versions of a program exhibit
          an addition of more than a total of 4 (18)
          lines of FUNCTION, DECLARATION, ASSIGNMENT,
          include-file (PREPROCESSOR) or control
          statements,
    then  there is a 50 (90) percent chance that the
          program has experienced a change in terms of
          adding functionality.

The result of PROBIT analysis for this case is summarized
in Table 4.16.

4.2.9   Removing  Debugging 

Twenty-two of the sixty-four sets of programs in the
current research have been observed to exhibit a decrease
in the number of OUTPUT statements. The decrease ranges
from 1 to 25. A summary of the change in OUTPUT statements
for all the 22 sets of programs is given in Table 4.17.
Observe that the removal of OUTPUT statements can occur
regardless of the direction of change in the total lines of
code.

Similar to the case of debugging, the size of the
program is found to be independent of the number of
statements deleted for removing debugging; the correlation
coefficient between the two quantities is merely 0.4186.
Using the PROBIT procedure, we conclude from the result
(Table 4.18) that

    if    two successive versions of a program exhibit
          a decrease of more than 1 (10) OUTPUT
          statements,
    then  there is a 50 (90) percent chance that the
          program has experienced a change in terms of
          removing debugging.
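
The count-based rules of Sections 4.2.6 to 4.2.9 share one shape:
a line count is compared against a 50 percent and a 90 percent
threshold. The sketch below collects the four pairs of thresholds
quoted in those sections in one table. The names THRESHOLDS and
chance_level, and the return value 0 for a change below both
thresholds, are illustrative assumptions.

```python
# (50% threshold, 90% threshold) in lines, as quoted in Sections 4.2.6-4.2.9
THRESHOLDS = {
    "removing documentation": (2, 8),
    "removing functionality": (2, 27),
    "adding functionality": (4, 18),
    "removing debugging": (1, 10),
}


def chance_level(pattern, lines_changed):
    """Return 90, 50 or 0 for a change of the given size.

    lines_changed is the magnitude of the decrease (or, for adding
    functionality, the increase) in the relevant statement types.
    """
    low, high = THRESHOLDS[pattern]
    if lines_changed > high:
        return 90
    if lines_changed > low:
        return 50
    return 0


print(chance_level("removing debugging", 5))      # 50
print(chance_level("adding functionality", 20))   # 90
```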

4.2.10   Redistribution 

In the current research, 11 of the 64 sets of programs
appear to possess this pattern of changes. Table 4.19
summarizes the activities of these 11 sets of programs in
terms of redistribution. From the Table, we obtain two
sets of data as described in Table 4.20. The result of
PROBIT analysis of data set (a) is given in Table 4.21; it
gives rise to the threshold values for using FUNCTION in a
quantitative definition of the change pattern of
redistribution. However, data set (b) is an invalid data
set for the PROBIT procedure, because the change patterns
of the 11 sets of programs are too similar to one another:
9 of them have exactly one preprocessor statement removed.
In the light of the partial information attained, we define
semi-subjectively that

    if    the changes in PREPROCESSOR and FUNCTION are
          in opposite directions (the former increases
          while the latter decreases, or the other way
          round),
    and   two successive versions of a program exhibit
          more than 3 lines of change in PREPROCESSOR
          or 3 (45) lines of change in FUNCTION,
    then  there is a 50 (90) percent chance that the
          program has gone through, positively or
          negatively, a change in terms of
          redistribution.

As in the case of reconstruction, the threshold value of 3
for PREPROCESSOR is determined based on the experience
and heuristics of a practitioner.
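
One possible reading of the redistribution rule is sketched below.
Treating the heuristic PREPROCESSOR threshold as a 50 percent
indicator, and reserving the 90 percent level for the FUNCTION
threshold of 45, is an interpretation made here for illustration;
the function name is likewise hypothetical.

```python
def redistribution_level(delta_preprocessor, delta_function):
    """Return 90, 50 or 0 for signed changes in line counts.

    delta_preprocessor -- change in PREPROCESSOR lines (second minus first)
    delta_function     -- change in FUNCTION lines (second minus first)
    """
    # The two statement types must move in opposite directions.
    if delta_preprocessor * delta_function >= 0:
        return 0
    if abs(delta_function) > 45:
        return 90
    if abs(delta_preprocessor) > 3 or abs(delta_function) > 3:
        return 50
    return 0


# One PREPROCESSOR line removed while 29 FUNCTION lines are added.
print(redistribution_level(-1, 29))  # 50
```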

4.3  INTUITIVE  RULES 

A  set  of  intuitive  rules  is  proposed  to  guide  the  use 
of  the  classification  in  change  analysis  of  a  program. 
The  set  of  rules  can  be  best  understood  by  the  graphic 
representation  depicted  in  Figure  4.1.  Literally,  the 
Figure  implies  that  in  analyzing  the  change  of  a  program, 
a  software  analyzer  should  abide  by  the  following  rules. 

(1)  Focus on the pattern change in pretty-
     printing. If the program is found to have
     experienced a significant change in terms of
     pretty-printing, then the analysis should be
     halted. Otherwise, proceed with the
     analysis.

(2)  Check the pattern change in terms of
     redistribution and reconstruction
     simultaneously. If the program is found not
     to have experienced any significant change in
     terms of either type of classification, then
     proceed with the analysis. Otherwise, halt
     the analysis.

(3)  Examine the pattern change in correction. If
     the program is found to have experienced a
     significant change in terms of correction,
     then the analysis should be halted.
     Otherwise, proceed with the analysis.

(4)  Concentrate on the pattern change in terms of
     removing/adding functionality. If the
     program is found not to have experienced any
     significant change in terms of either type of
     classification, then proceed with the
     analysis. Otherwise, halt the analysis.

(5)  Study the pattern change in terms of
     removing/adding documentation. If the
     program is found not to have experienced any
     significant change in terms of either type of
     classification, then proceed with the
     analysis. Otherwise, halt the analysis.

(6)  Direct final attention to the pattern change
     in terms of removing/adding debugging.

Figure 4.1  Graphic representation of the intuitive rules to guide the use of the classification in
change analysis of a program. (Box labels: REDISTRIBUTION; REMOVING FUNCTIONALITY; REMOVING
DOCUMENTATION; REMOVING DEBUGGING; PRETTY-PRINTING; CORRECTION; RECONSTRUCTION; ADDING
FUNCTIONALITY; DOCUMENTATION; DEBUGGING.)

The ordering of the rules is based on the previous
research on the weight of a program, i.e., the frequencies
of change of individual statement types [Gustafson1985].
An example is given to demonstrate an application of the
intuitive rules in the following section.
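
The six rules amount to an ordered sequence of checks that stops
at the first stage showing a significant change. The sketch below
expresses that ordering; the names STAGES and analyze, and the
representation of the analyst's findings as a set of category
names, are assumptions made for illustration.

```python
# Stages in the order prescribed by rules (1) to (6)
STAGES = [
    ("pretty-printing",),
    ("redistribution", "reconstruction"),
    ("correction",),
    ("removing functionality", "adding functionality"),
    ("removing documentation", "documentation"),
    ("removing debugging", "debugging"),
]


def analyze(significant):
    """Return (rule number, categories found) for the first stage that
    shows a significant change, or None if no stage does.

    significant -- set of category names judged significant
    """
    for number, stage in enumerate(STAGES, start=1):
        hits = [category for category in stage if category in significant]
        if hits:
            return number, hits
    return None


found = {"redistribution", "adding functionality", "debugging"}
print(analyze(found))  # (2, ['redistribution'])
```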

4.4  EXAMPLE 

The current example comprises two successive versions of
the C module "Recreate_Listing" developed by students in a
Software Engineering class; neither version is final. The
module, upon completion, accepts data array records and
counter arrays as inputs; it recreates and prints out a
list containing the index values of individual entity
names of the data arrays. The source codes of the two
versions of the module are given separately in Appendices
I and J; their sizes are 3824 and 6136 bytes,
respectively.

As a preparation for change analysis, both versions
of the module are used as inputs to the CHANGES program
(described in Section 2.3), yielding an output file,
main.results. Data contained in the file main.results are
then processed by Excel on an Apple Macintosh to generate
change patterns between the two successive versions of the
module. Figures 4.2 to 4.4 are reproductions of those
change patterns. In the following sub-section, the
classification of these patterns is discussed. Change
analysis of the module will then be examined based on the
classification and the intuitive rules.

4.4.1  Identifying  Program  Change  Patterns 

Figure 4.2 contrasts the changes in terms of the
number of occurrences of each type of statement between the
two versions of the module. The solid bars are for the
first version and the empty ones for the second. The
pattern of changes in this Figure indicates that
redistribution, adding functionality and debugging have
occurred when the module "Recreate_Listing" progressed
between the two versions. First, the decrease in the
number of PREPROCESSOR statements in conjunction with the
increase in the number of FUNCTION statements indicates the
existence of redistribution. Secondly, the increases in the
numbers of IF, ASSIGNMENT and DECLARATION statements imply
the presence of adding functionality. Finally, the increase
in the number of OUTPUT statements suggests that debugging
was involved as the development of the module progressed.

Figure 4.2  Program differences represented in a bar chart in terms of various statement types
and lines of code.

Figure 4.3 compares the indentation levels between
the two versions of the module. Again, the solid bars are
for the first version and the empty ones for the second.
Note that the number of the ZERO indentation level in the
second version is actually greater than that in the first
version, although the empty bar for the ZERO indentation
level is shorter than the solid one in the Figure. This
is due to the normalization scheme adopted (see Section
2.4). The consistent increases in the numbers of all
indentation levels lead us to conclude that reconstruction
did not happen in this case.

Figure 4.4 collates the differences between the two
versions of the module with and without pretty-printing.
The former are represented in solid bars and the latter in
empty bars. In the Figure, more differences in the FOR
and OUTPUT statements are detected between the two
versions of the module when they are not pretty-printed.
This hints at the occurrence of pretty-printing.

In summary, the change patterns shown in Figures
4.2 to 4.4 lead us to conclude that redistribution, adding
functionality, debugging and pretty-printing are the major
activities that occurred between the two versions. The
quality of the program needs to be analyzed; this is
discussed in the next sub-section.

4.4.2  Analyzing  Program  Changes 

The significance of the change patterns identified in
the preceding sub-section can be appreciated when they
are examined quantitatively and discussed in the context
of the newly proposed intuitive rules. The explanation
can be best understood with reference to Figure 4.1.

Figure 4.4  Program differences represented in a bar chart in terms of various statement types
and lines of code; the programs were pretty-printed before comparison.

(1)  The change pattern of pretty-printing is
     examined. From Figure 4.4, seven statement
     types are changed between the two versions;
     two of them are identified to have gone
     through pretty-printing. The ratio N/M is
     2/7, indicating that there exists a 75% chance
     (see Table 4.7) for the program to have gone
     through a significant change in terms of
     pretty-printing. Considering the case when
     a software development should be interrupted
     only if we are 80% sure that the development
     is abnormal, we should proceed with the
     analysis.

(2)  The  change  pattern  of  redistribution  is 
examined.  From  Figure  4.2,  the  number  of 
PREPROCESSOR  decreases  by  1  while  the  number 
of  FUNCTION  increases  by  29.  Since  more  than 
17  lines  of  change  in  FUNCTION  are  detected 
(see  Table  4.19  for  the  significance  of  the 
threshold  value  of  17),  we  are  at  least  80% 
sure  that  a  significant  change  with  respect 
to  redistribution  exists.  The  quality  of  the 
software  is  in  doubt;  the  analysis  should  be 
terminated. 
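
The reading of a probability level from a PROBIT result, as done
in step (1), can be sketched as a table lookup. The (probability,
dose) pairs below are copied from Table 4.7; the function name
probability_level and the choice to report the highest tabulated
probability whose dose does not exceed the observed ratio (rather
than to interpolate) are assumptions made for illustration.

```python
# (probability, dose) pairs taken from the PROBIT result in Table 4.7
PRETTY_PRINTING_DOSES = [
    (0.50, 0.10590236),
    (0.55, 0.12713546),
    (0.60, 0.15307572),
    (0.65, 0.18546148),
    (0.70, 0.22703320),
    (0.75, 0.28240854),
    (0.80, 0.36010509),
    (0.85, 0.47803794),
    (0.90, 0.68275689),
]


def probability_level(ratio, table=PRETTY_PRINTING_DOSES):
    """Return the highest tabulated probability whose dose is <= ratio."""
    level = 0.0
    for probability, dose in table:
        if ratio >= dose:
            level = probability
    return level


level = probability_level(2 / 7)
print(level)          # 0.75
print(level >= 0.80)  # False: below the 80% bar, so the analysis proceeds
```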

Table 4.1  Increase of OUTPUT statements between two versions of a program.

(V1, V2 = Version 1, Version 2; lines = lines of code; OUTPUT = number of OUTPUT statements)

Set  V1 lines V1 OUTPUT  V2 lines V2 OUTPUT  Avg size  Increase
  1       448         9       455        18       451         9
  2       412        10       455        28       433        18
  3        62         0        64         2        63         2
  4        64         2        72         5        68         3
  5       103        11       107        19       105         8
  6        15         0        39         2        27         2
  7       424        13       453        16       438         3
  8       438         3       453         4       445         1
  9       148         0       241         1       194         1
 10       377         1       447        20       412        19
 11        94         2       118         5       106         3
 12       128         2       138        10       133         8
 13       413         0      1125         3       769         3
 14      1125         3      1485         8       130         5
 15      1485         8      2104        32      1794        24
 16       413         0      2498        32      1455        32
 17        34         7        50        16        42         9
 18       112         8        47        10        79         2
 19        34         7        68        18        51        11
 20        51         1        53         3        52         2
 21        69         3       104        10        86         7
 22        47        10        68        18        57         8
 23        82         1        94         2        88         1

CORRELATION COEFFICIENT = 0.7348
Table 4.2  Input data for PROBIT analysis of Debugging.

      DOSE     N  RESPONSE
  1     32     1         1
  2     24     1         1
  3     19     1         1
  4     18     1         1
  5     11     1         0
  6      9     2         1
  7      8     3         2
  8      7     1         1
  9      5     1         0
 10      3     4         2
 11      2     4         1
 12      1     3         0
 13      0    41         0
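
The DOSE and N columns of Table 4.2 can be recovered by grouping
the "Increase" column of Table 4.1. The sketch below does this
with a counter; the variable names are illustrative. The RESPONSE
column is not derived here, since it records how many sets in each
group were judged to exhibit debugging.

```python
from collections import Counter

# "Increase" column of Table 4.1, one entry per program set
increases = [9, 18, 2, 3, 8, 2, 3, 1, 1, 19, 3, 8, 3, 5, 24, 32,
             9, 2, 11, 2, 7, 8, 1]
TOTAL_SETS = 64  # sets of programs analyzed in the study

counts = Counter(increases)
counts[0] = TOTAL_SETS - len(increases)  # sets with no increase in OUTPUT

# One (dose, N) row per distinct dose, largest dose first as in Table 4.2
rows = sorted(counts.items(), reverse=True)
for dose, n in rows:
    print(f"{dose:>4} {n:>3}")
```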
Table 4.3  Increase of COMMENT statements between two versions of a program.

(V1, V2 = Version 1, Version 2; lines = lines of code; COMMENT = number of COMMENT statements)

Set  V1 lines  V1 COMMENT  V2 lines  V2 COMMENT  Avg size  Increase
  1       431           6       438          41       434        35
  2       438          41       427          45       432         4
  3       431           6       427          45       429        39
  4       433           3       437          46       435        43
  5       437          46       458         189       447       143
  6       413           6       458         189       435       183
  7       448           5       455          46       451        41
  8       412           5       455          46       433        41
  9       431          46       452         184       441       138
 10        64          22        72          25        68         3
 11       107          18       107          26       107         8
 12       106          24       113          32       109         8
 13        55          12        79          48        67        36
 14        55          11        79          48        67        37
 15        15          13        59          24        37        11
 16       453           7       438          41       445        34
 17       438          41       453          48       445         7
 18       453          48       474         191       463       143
 19       424           7       474         191       449       184
 20       148          26       241          38       194         8
 21       241          38       377          55       309        17
 22       377          55       447          66       412        11
 23       447          66       424          67       435         1
 24       424          67       424          68       424         1
 25       424          68       472         123       448        55
 26       141          29       184          49       162        20
 27        47           0        68           2        57         2
 28       413          45      1125         139       769        94
 29      1125         139      1485         181      1305        42
 30      1485         181      2104         275      1794        94
 31      2104         275      2328         325      2216        50
 32       413          45      2498         329      1455       284
 33        51           1       112          29        81        28
 34        51           0        53           1        52         1
 35        53           1        69          16        61        15
 36       104          15       120          39       112        24
 37       120          39       129          42       124         3
 38        51           0       129          42        90        42

CORRELATION COEFFICIENT = 0.4443
Table 4.4  Result of PROBIT for quantitative analysis of Documentation.

PROBIT ANALYSIS ON DOSE

                             95 PERCENT FIDUCIAL LIMITS
PROBABILITY  DOSE            LOWER           UPPER
0.01         0.00620648      0.00000000      0.2
0.02         0.01127852      0.00000000      0.3
0.03         0.01647545      0.00000000      0.4
0.04         0.02191030      0.00000000      0.4
0.05         0.02762861      0.00000000      0.5
0.06         0.03365733      0.00000000      0.5
0.07         0.04001656      0.00000000      0.6
0.08         0.04672365      0.00000000      0.6
0.09         0.05379485      0.00000000      0.7
0.10         0.06124619      0.00000000      0.8
0.15         0.10479359      0.00000000      1.0
0.20         0.16059017      0.00000000      1.3
0.25         0.23161256      0.00000000      1.6
0.30         0.32180129      0.00000000      2.0
0.35         0.43645257      0.00000000      2.4
0.40         0.58280540      0.00000000      2.8
0.45         0.77095668      0.00000000      3.4
0.50         1.01533637      0.00000000      4.1
0.55         1.33718011      0.00000001      5.0
0.60         1.76887162      0.00000021      6.3
0.65         2.36201597      0.00000446      8.2
0.70         3.20355436      0.00010671      11.5
0.75         4.45100179      0.00285048      19.1
0.80         6.41949590      0.07394092      50.3
0.85         9.83750916      1.00026825      511.3
0.90         16.83219814     4.22058317      59435.9
0.91         19.16369066     5.15367069      217351.9
0.92         22.06394066     6.20764355      916930.7
0.93         25.76202996     7.41705783      4584563.9
0.94         30.62952678     8.83913182      28319660.3
0.95         37.31306265     10.57008970     230781234.6
0.96         47.05129507     12.78260853     2769176875.2
0.97         62.57236442     15.82471467     59949523007.6
0.98         91.40452554     20.55097808     3653992392227.4
0.99         166.10176166    30.07277334     2452809210542120.0
Table 4.5  Data reflecting the activity of Correction.

Column headings legible in the source: BREAK, CONTINUE, ASSIGNMENT, RETURN, FUNCTION, DECLARATION, TOTAL.
1 

2 

+  1 

+  1 

.1 

3 
4 
4 
14 

_2 

+  1 

+  1 

_1 

.2 

.1 

_1 

+2 

.7 

+  3 

+  1 

5 
6 

8     1 

+3 

5 
1 
6 

+  3 

+2 

+  1 

+4 

+2 

2 

10 

_1 

+  1 

2 

13 
14 

1 

+  1 

.1 

_1 

4 

16 

.1 

_2 

_1 

.4 
Table 4.6  Result of PROBIT for quantitative analysis of Correction.

PROBIT ANALYSIS ON DOSE

                             95 PERCENT FIDUCIAL LIMITS
PROBABILITY  DOSE            UPPER
0.01         0.11150498      0.72822395
0.02         0.14676349      0.82456998
0.03         0.17471297      0.89289236
0.04         0.19919418      0.94845581
0.05         0.22161661      0.99655331
0.06         0.24267929      1.03971308
0.07         0.26278796      1.07936067
0.08         0.28220275      1.11638816
0.09         0.30110245      1.15139578
0.10         0.31961652      1.18480976
0.15         0.40918897      1.33686615
0.20         0.49796445      1.47672275
0.25         0.58932682      1.61418733
0.30         0.68557383      1.75578850
0.35         0.78873887      1.90784821
0.40         0.90095162      2.07858204
0.45         1.02469095      2.28140763
0.50         1.16304948      2.54354863
0.55         1.32008982      2.93984267
0.60         1.50139481      3.94117784
0.65         1.71499611      .
0.70         1.97306844      .
0.75         2.29530379      .
0.80         2.71642701      .
0.85         3.30576874      .
0.90         4.23220962      .
0.91         4.49243800      .
0.92         4.79330578      .
0.93         5.14743551      .
0.94         5.57395772      .
0.95         6.10371266      .
0.96         6.79078119      .
0.97         7.74232225      .
0.98         9.21676148      .
0.99         2.13115403      .

LOWER limits are printed for nine rows at the high-probability end of the table:
1.74922256, 2.07677798, 2.32312810, 2.54892358, 2.77624313, 3.02257870,
3.31128553, 3.68865254, 4.29788066.
Table 4.7  Result of PROBIT for quantitative analysis of Pretty-printing.

PROBIT ANALYSIS ON DOSE

                             95 PERCENT FIDUCIAL LIMITS
PROBABILITY  DOSE            LOWER           UPPER
0.01         0.00359507      .               0.06292549
0.02         0.00534399      .               0.07275113
0.03         0.00687219      .               0.07986129
0.04         0.00830356      .               0.08573071
0.05         0.00968507      .               0.09087535
0.06         0.01104060      .               0.09554341
0.07         0.01238436      .               0.09987576
0.08         0.01372576      .               0.10396120
0.09         0.01507146      .               0.10786001
0.10         0.01642650      .               0.11161548
0.15         0.02346113      .               0.12917096
0.20         0.03114454      .               0.14613311
0.25         0.03971307      .               0.16384922
0.30         0.04939942      .               0.18370446
0.35         0.06047244      .               0.20806020
0.40         0.07326641      .               0.24337088
0.45         0.08821543      .               .
0.50         0.10590236      .               .
0.55         0.12713546      .               .
0.60         0.15307572      .               .
0.65         0.18546148      .               .
0.70         0.22703320      .               .
0.75         0.28240854      .               .
0.80         0.36010509      .               .
0.85         0.47803794      .               .
0.90         0.68275689      0.18320956      .
0.91         0.74414212      0.22122090      .
0.92         0.81709926      0.24590687      .
0.93         0.90560228      0.26828284      .
0.94         1.01582463      0.29072952      .
0.95         1.15799995      0.31478702      .
0.96         1.35066218      0.34217866      .
0.97         1.63198366      0.37566671      .
0.98         2.09867538      0.42117163      .
0.99         3.11963954      0.49765526      .
Table 4.8  Data reflecting the activity of Reconstruction.

(Marks are listed per set in the order printed; the alignment of the marks with the
indentation-level columns is not preserved in the source.)

Set   Marks
  1   x x x x
  2   x x x x x x x
  3   x x x x x x -
  4   x x x x x x
  5   x x x x -
  6   x -
  7   x x x x x x
  8   x x x -
  9   x -
 10   x -
 11   x -
 12   x x x -
 13   x -
 14   x x x x -
 15   x x x x
 16   x x -
 17   x -
 18   x x
 19   x x
 20   x x x
 21   x -
 22   x x -
 23   x -
 24   x -
 25   x -

x  indicates the activity of reconstruction happening in that level.
-  indicates the highest indentation level in the second version of the program.
Table 4.9  Input data for PROBIT analysis of Reconstruction.

      DOSE     N  RESPONSE
  1    1/4     1         0
  2    1/5     2         2
  3    1/6     2         1
  4    1/7     4         4
  5    2/5     1         1
  6    1/3     1         1
  7    2/7     3         3
  8    3/4     1         0
  9    3/7     2         2
 10    4/7     4         2
 11    6/7     2         2
 12    7/7     2         1
 13      0    39         0

*  Only 3 sets of data are of neither 100% nor 0% response;
   moreover, all three sets of data are of 50% response.
Table 4.10  Result of PROBIT for quantitative analysis of Reconstruction.

PROBIT ANALYSIS ON DOSE

                             95 PERCENT FIDUCIAL LIMITS
PROBABILITY  DOSE            LOWER
0.01         107.02127855    3.19345977
0.02         64.43615097     2.61281373
0.03         46.70147304     2.29890731
0.04         36.65770134     2.08698947
0.05         30.10380060     1.92848786
0.06         25.45716481     1.80263013
0.07         21.97701845     1.69867638
0.08         19.26666462     1.61037745
0.09         17.09305821     1.53378888
0.10         15.30967137     1.46626766
0.15         9.70147845      1.21437523
0.20         6.75106227      1.04203536
0.25         4.94630019      0.91069888
0.30         3.74079577      0.80373410
0.35         2.88766100      0.71229031
0.40         2.25877368      0.63080071
0.45         1.78100395      0.55498907
0.50         1.40959181      0.48032362
0.55         1.11563429      0.39857046
0.60         0.87965832      .
0.65         0.68808252      .
0.70         0.53115679      .
0.75         0.40170410      .
0.80         0.29431651      .
0.85         0.20480889      .
0.90         0.12978391      .
0.91         0.11624304      .
0.92         0.10312886      .
0.93         0.09041031      .
0.94         0.07805068      .
0.95         0.06600326      .
0.96         0.05420277      .
0.97         0.04254575      .
0.98         0.03083594      .
0.99         0.01856593      .

UPPER limits are printed for ten rows; only the digits after the decimal point are legible:
.36582117, .33588262, .30979057, .28597182, .26342504, .24137926, .21909146,
.19562651, .16938581, .13625452.
Table 4.11  Decrease of COMMENT statements between two versions of a program.

(V1, V2 = Version 1, Version 2; lines = lines of code; COMMENT = number of COMMENT statements)

Set  V1 lines  V1 COMMENT  V2 lines  V2 COMMENT  Avg size  Decrease
  1       413           6       433           3       423         3
  2       107          26       106          24       106         2
  3        66          12        65          11        55         1
  4       140          31       128          29       134         2
  5        34           5        50           1        42         4
  6       112          29        47           0        79        29
  7        34           5        68           2        51         3
  8        69          16       104          15        86         1
  9         3           3        82           2        82         1

CORRELATION COEFFICIENT = -0.097
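
The correlation coefficient quoted for Table 4.11 can be checked
with a short computation of the Pearson coefficient between the
"Avg size" and "Decrease" columns as printed. That the Pearson
coefficient is the statistic intended is an assumption; the helper
name pearson is illustrative.

```python
from math import sqrt

# "Avg size" and "Decrease" columns of Table 4.11, as printed
average_size = [423, 106, 55, 134, 42, 79, 51, 86, 82]
comments_removed = [3, 2, 1, 2, 4, 29, 3, 1, 1]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(average_size, comments_removed)
print(round(r, 3))  # -0.097
```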
Table 4.12  Result of PROBIT for quantitative analysis of Removing documentation.

PROBIT ANALYSIS ON DOSE

                             95 PERCENT FIDUCIAL LIMITS
PROBABILITY  DOSE            LOWER           UPPER
0.01         0.13030754      .               0.80370316
0.02         0.17775860      .               0.91427178
0.03         0.21646897      .               0.99479347
0.04         0.25105154      .               1.06197710
0.05         0.28321788      .               1.12166640
0.06         0.31382343      .               1.17669897
0.07         0.34336805      .               1.22872196
0.08         0.37217437      .               1.27881655
0.09         0.40046572      .               1.32776572
0.10         0.42840516      .               1.37618831
0.15         0.56639627      .               1.62847301
0.20         0.70712877      .               1.96506841
0.25         0.85542583      .               .
0.30         1.01492749      .               .
0.35         1.18916239      .               .
0.40         1.38207456      .               .
0.45         1.59845569      .               .
0.50         1.84445561      .               .
0.55         2.12831455      .               .
0.60         2.46152892      .               .
0.65         2.86085107      .               .
0.70         3.35197985      .               .
0.75         3.97698593      .               .
0.80         4.81102824      1.70248914      .
0.85         6.00642465      2.12566036      .
0.90         7.94111938      2.54177023      .
0.91         8.49515025      2.63795270      .
0.92         9.14092096      2.74215642      .
0.93         9.90778395      2.85698878      .
0.94         10.84054333     2.98619432      .
0.95         12.01201172     3.13551855      .
0.96         13.55106821     3.31454916      .
0.97         15.71595481     3.54127628      .
0.98         19.13840772     3.85630633      .
0.99         26.10759590     4.39075496      .
Table 4.14  Result of PROBIT for quantitative analysis of Removing functionality.

PROBIT ANALYSIS ON DOSE

                             95 PERCENT FIDUCIAL LIMITS
PROBABILITY  DOSE            LOWER           UPPER
0.01         0.02395598      .               0.40045266
0.02         0.04071631      .               0.51378816
0.03         0.05700605      .               0.60324999
0.04         0.07342869      .               0.68177921
0.05         0.09021888      .               0.75409786
0.06         0.10750231      .               0.82255650
0.07         0.12536044      .               0.88853750
0.08         0.14385356      .               0.95294891
0.09         0.16303094      .               1.01643884
0.10         0.18293583      .               1.07950239
0.15         0.29473461      .               1.40084269
0.20         0.43057952      .               1.75940050
0.25         0.59605259      .               2.20105839
0.30         0.79820524      .               2.82142270
0.35         1.04626445      .               3.94918746
0.40         1.35258011      .               .
0.45         1.73404900      .               .
0.50         2.21436417      .               .
0.55         2.82772210      .               .
0.60         3.62522608      .               .
0.65         4.68658635      .               .
0.70         6.14304244      .               .
0.75         8.22646987      2.24033975      .
0.80         11.38792836     3.19656253      .
0.85         16.63669120     4.26226337      .
0.90         26.80398156     5.72511159      .
0.91         30.07655233     6.11252324      .
0.92         34.08611333     6.55190307      .
0.93         39.11448321     7.05932424      .
0.94         45.61212236     7.65889260      .
0.95         54.35013947     8.38906047      .
0.96         66.77783194     9.31649800      .
0.97         86.01558798     10.57161076     .
0.98         120.42861238    12.46387882     .
0.99         204.68411661    16.06642642     .
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Table  4. 15  Data  reflecting  the  activity  of  Adding  functionality. 
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3 

22 

1 

7 

1 

41 

36 

48 

19 

3 

164 

164 

71 
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1 

2 

3 

2 

4 
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Table  4.16  Result  of  PROBIT  for  quantitative  analysis  of  Adding  functionality. 


PROBIT ANALYSIS ON DOSE

PROBABILITY          DOSE        95 PERCENT FIDUCIAL LIMITS
                                    LOWER              UPPER

   0.01           0.26808408      0.00000001          1.47812955
   0.02           0.36936075      0.00000004          1.77241290
   0.03           0.45264301      0.00000013          1.99015161
   0.04           0.52745363      0.00000031          2.17232077
   0.05           0.59733604      0.00000063          2.33345832
   0.06           0.66406487      0.00000117          2.48061574
   0.07           0.72867910      0.00000201          2.61781631
   0.08           0.79185061      0.00000326          2.74760260
   0.09           0.85404564      0.00000507          2.87170005
   0.10           0.91560541      0.00000759          2.99134348
   0.15           1.22139833      0.00004039          3.54881858
   0.20           1.53575437      0.00015210          4.07659075
   0.25           1.86918974      0.00047309          4.60437314
   0.30           2.22989200      0.00130673          5.15190852
   0.35           2.62599173      0.00333845          5.73730363
   0.40           3.06672228      0.00809516          6.38150063
   0.45           3.56341526      0.01896681          7.11295625
   0.50           4.13071118      0.04350960          7.97527760
   0.55           4.78832064      0.09869916          9.04281925
   0.60           5.56384743      0.22288519         10.45760469
   0.65           6.49764989      0.50170866         12.53017326
   0.70           7.65183912      1.11350679         16.06254945
   0.75           9.12843382      2.34594831         23.56311870
   0.80          11.11035412      4.37163329         44.42163783
   0.85          13.96986914      6.93743229        121.08893896
   0.90          18.63551121      9.92407412        534.47594844
   0.91          19.97876234     10.62048345        779.42474703
   0.92          21.54797205     11.37624164       1180.08747554
   0.93          23.41603453     12.21193568       1870.80595601
   0.94          25.69443971     13.15786387       3144.17898370
   0.95          28.56478380     14.26138255       5710.10150025
   0.96          32.34933628     15.60288038      11564.99302784
   0.97          37.69587611     17.33606657      27682.34437384
   0.98          46.19541944     19.81618926      88886.38417365
   0.99          63.64710283     24.23088995     564320.95848118
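The DOSE column of Table 4.16 can be checked for internal consistency. The sketch below assumes that the PROBIT procedure fitted a normal tolerance distribution on the logarithm of dose; this section does not state the transformation, so that is an assumption. Under it, ln(dose) is linear in the standard normal quantile of the probability. The names `ED50`, `DOSE_90`, `sigma` and `dose_at` are illustrative only. Two table entries fix the line, and the remaining entries can then be compared with it.

```python
from math import exp, log
from statistics import NormalDist

# DOSE entries copied from Table 4.16.
ED50 = 4.13071118      # dose at probability 0.50
DOSE_90 = 18.63551121  # dose at probability 0.90

z = NormalDist().inv_cdf  # standard normal quantile function

# Assumed model: ln(dose_p) = ln(ED50) + sigma * z_p.
# sigma is estimated from the 0.50 and 0.90 rows only.
sigma = log(DOSE_90 / ED50) / z(0.90)


def dose_at(p: float) -> float:
    """Dose at cumulative probability p under the assumed log-normal model."""
    return ED50 * exp(sigma * z(p))


# Compare a few probabilities against the DOSE column of the table.
for p in (0.01, 0.10, 0.99):
    print(f"p = {p:.2f}  dose = {dose_at(p):.5f}")
```

If the assumption holds, the printed doses should agree closely with the tabulated values at the same probabilities; a large discrepancy would point to a different dose transformation.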
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Table  4.17  Decreasing  of  OUTPUT  statements  between  two  versions  of  a  program. 


             VERSION 1              VERSION 2
        lines of    no. of     lines of    no. of     Average    Decrease
No.       code      OUTPUT       code      OUTPUT       size     of OUTPUT

  1        432        25          438         3          435        22
  2        438         3          427         0          432         3
  3        431        25          427         0          429        25
  4        433        11          437         1          435        10
  5        437         1          458         0          447         1
  6        413        11          458         0          435        11
  7        412        10          450         9          431         1
  8         72         5           67         2           69         3
  9         67         2           65         0           66         2
 10        106        19          113         0          109        19
 11         82         2           82         1           82         1
 12         59         2           57         0           58         2
 13        453        16          438         3          448        13
 14        453         4          474         3          463         1
 15        424        13          474         3          449        10
 16        447        20          424         1          435        19
 17        140         4          128         2          134         2
 18        141        10          184         0          162        10
 19        118         5          114         2          116         3
 20         51        16          112         8           81         8
 21        120        10          129         0          124        10
 22         51         1          129         0           90         1

CORRELATION COEFFICIENT = 0.4186
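As an illustration of how a correlation coefficient of this kind is obtained, the sketch below computes Pearson's r between the average size and the decrease of OUTPUT statements for the 22 version pairs of Table 4.17. Which pair of columns the reported coefficient was computed from is not stated here, so the choice of columns is an assumption and the result need not reproduce the reported value. The helper name `pearson_r` is illustrative.

```python
from math import sqrt

# Columns "Average size" and "Decrease of OUTPUT" as read from Table 4.17.
average_size = [435, 432, 429, 435, 447, 435, 431, 69, 66, 109, 82,
                58, 448, 463, 449, 435, 134, 162, 116, 81, 124, 90]
output_decrease = [22, 3, 25, 10, 1, 11, 1, 3, 2, 19, 1,
                   2, 13, 1, 10, 19, 2, 10, 3, 8, 10, 1]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(average_size, output_decrease)
print(f"r = {r:.4f}")
```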
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Table  4.18  Result  of  PROBIT  for  quantitative  analysis  of  Removing  debugging. 


PROBIT ANALYSIS ON DOSE

PROBABILITY          DOSE        95 PERCENT FIDUCIAL LIMITS
                                    LOWER          UPPER

   0.01           0.00987305          .          0.24106379
   0.02           0.01665264          .          0.31012384
   0.03           0.02320211          .          0.36431487
   0.04           0.02977738          .          0.41155477
   0.05           0.03647776          .          0.45472663
   0.06           0.04335614          .          0.49525537
   0.07           0.05044645          .          0.53396770
   0.08           0.05777351          .          0.57139492
   0.09           0.06535732          .          0.60790426
   0.10           0.07321522          .          0.64376403
   0.15           0.11715101          .          0.81925989
   0.20           0.17021341          .          0.99807044
   0.25           0.23452432          .          1.18948229
   0.30           0.31274365          .          1.40222998
   0.35           0.40833837          .          1.64746143
   0.40           0.52593629          .          1.94230557
   0.45           0.67185443          .          2.31725434
   0.50           0.85493131          .          2.83633204
   0.55           1.08789571          .          3.67134794
   0.60           1.38972642          .          5.58354746
   0.65           1.78995562          .              .
   0.70           2.33708203          .              .
   0.75           3.11655334          .              .
   0.80           4.29406555          .              .
   0.85           6.23902032      1.36525446         .
   0.90           9.98300049      3.01750086         .
   0.91          11.18325487      3.36149229         .
   0.92          12.65125828      3.73474799         .
   0.93          14.48878139      4.15060798         .
   0.94          16.85822524      4.62788362         .
   0.95          20.03707294      5.19562959         .
   0.96          24.54573128      5.90343461         .
   0.97          31.50177217      6.84766259         .
   0.98          43.89139787      8.25673647         .
   0.99          74.03059791     10.92480081         .

(A period marks a limit that is not available.)
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Table  4.19  Data  reflecting  the  activity  of  Redistribution. 


No.     PREPROCESSOR     FUNCTION

  1           1             +2
  2          -1             +6
  3          -1             +7
  4          -1             +9
  5          -1             +7
  6          -1             +7
  7          -1             +3
  8          -2            +16
  9          +3             -1
 10          -1             +1
 11          -1             +4
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Table  4.20  Input  data  for  PROBIT  analysis  of  Redistribution. 


(a) FUNCTION

dose*       N     response

 15         1        1
  9         1        1
  7         3        1
  5         1        1
  3         2        1
  1         3        1
  0        53        0

(b) PREPROCESSOR

dose**      N     response

  3         1        1
  2         1        1
  1         9        4
  0        53        0

*  Only odd levels of dose are adopted here. The dose of level
one designates that one to two FUNCTIONS have been changed
between versions, the dose of level three designates that
three to four FUNCTIONS have been changed, etc.

**  The negative signs are dropped from Table 4.19. The dose of
level three designates that three PREPROCESSORS have been
changed; the change can be in the positive or the negative
direction.
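The grouping rule in the first footnote can be sketched as follows. The FUNCTION changes are taken as read from Table 4.19; treating a negative change by its magnitude is an assumption here, since the footnote states the dropping of signs only for PREPROCESSORS. The names `function_changes` and `odd_dose_level` are illustrative. Each change count is mapped to the odd level that covers it (1 for one to two changes, 3 for three to four, and so on), and the number of version pairs at each level is tallied.

```python
from collections import Counter

# FUNCTION changes for the eleven version pairs, as read from Table 4.19.
function_changes = [+2, +6, +7, +9, +7, +7, +3, +16, -1, +1, +4]


def odd_dose_level(change: int) -> int:
    """Map a change count to its odd dose level: 1-2 -> 1, 3-4 -> 3, ..."""
    n = abs(change)  # assumption: only the magnitude of the change matters
    if n == 0:
        return 0
    return n if n % 2 == 1 else n - 1


# Number of version pairs (N) observed at each dose level.
levels = Counter(odd_dose_level(c) for c in function_changes)

for dose in sorted(levels, reverse=True):
    print(f"dose {dose:2d}  N = {levels[dose]}")
```

The tallies for the nonzero levels can be compared with the N column of part (a) of Table 4.20; the dose-zero row counts version pairs with no FUNCTION change and is not derived from this list.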
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Table  4.21  Result  of  PROBIT  for  quantitative  analysis  of  Redistribution. 


PROBIT ANALYSIS ON DOSE

PROBABILITY          DOSE        95 PERCENT FIDUCIAL LIMITS
                                    LOWER          UPPER

   0.01           0.01972508          .          0.67262515
   0.02           0.03537278          .          0.84156771
   0.03           0.05123919          .          0.97293020
   0.04           0.06771205          .          1.08724836
   0.05           0.08494580          .          1.19196457
   0.06           0.10302922          .          1.29077852
   0.07           0.12202613          .          1.38587420
   0.08           0.14198960          .          1.47870054
   0.09           0.16296802          .          1.57031014
   0.10           0.18500803          .          1.66152904
   0.15           0.31280264          .          2.13434947
   0.20           0.47483315          .          2.69705785
   0.25           0.67928979          .          3.52061375
   0.30           0.93693942          .              .
   0.35           1.26218787          .              .
   0.40           1.67464902          .              .
   0.45           2.20157775          .              .
   0.50           2.88177827          .              .
   0.55           3.77213387          .              .
   0.60           4.95903671          .              .
   0.65           6.57956410          .              .
   0.70           8.86358907          .              .
   0.75          12.22548334          .              .
   0.80          17.48960860      3.35368468         .
   0.85          26.54915551      4.77313522         .
   0.90          44.88802984      6.41999307         .
   0.91          50.95874648      6.83526977         .
   0.92          58.48770418      7.29963746         .
   0.93          68.05629157      7.82864452         .
   0.94          80.60476406      8.44532285         .
   0.95          97.76405682      9.18603696         .
   0.96         122.64650271     10.11324035         .
   0.97         162.07604704     11.34788431         .
   0.98         234.77502165     13.17366686         .
   0.99         421.01958619     16.55821929         .

(A period marks a limit that is not available.)
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CHAPTER  FIVE 
CONCLUSIONS  AND  EXTENSIONS 

A classification of program change patterns and a set
of intuitive rules for evaluating those patterns have been
proposed. First, data on the changes between two successive
versions of a program were identified and collected. From
these data, criteria were derived to classify the change
patterns, which gave rise to the ten classes of program
change patterns. Further quantitative study of the
classification yielded the set of intuitive rules. The
classification and the rules have been shown to facilitate
program change analysis, to improve its reliability, and to
make the progress of software development easier to assess.
In summary, the proposed technique can help a software
manager analyze the progress of a program during the
software development stage.

Extensions  of  the  current  research  may  include: 

(1)  Collecting  and  analyzing  more  change  patterns 
between  program  sets.  More  data  will  lead  to 
better  statistical  results.  This  is 
especially  needed  in  the  cases  of 
RECONSTRUCTION  and  REDISTRIBUTION. 
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(2)  Refining  the  definition  of  the  change  pattern 
Debugging. A statement of the output-statement
type (defined in Section 3.3.1) may be added
for purposes other than debugging. The content
of the output statement needs to be taken into
consideration in determining whether it is for
debugging.

(3)  Identifying  the  average  time  required  for  a 
program  to  progress  from  one  version  to  the 
next.  It  appears  that  the  time  may  be 
dependent  on  the  types  of  the  change  patterns 
involved. 

(4) Extending the analysis to predict the
progress of a program during its development.
This work has aimed at determining the quality
of a software development at its present state
from the historical data; the analysis could
be extended to predict the quality of the
program when development has to continue.
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APPENDIX  A 
SOURCE  CODE  OF  SAMPLEPROGRAM1 

/*                                                                          */
/*  Procedure   : SyntaxCheck                  Last Revision : 4/30/86      */
/*                                                                          */
/*  Programmer  : Monte L. Hall                                             */
/*                                                                          */
/*  Description : This module accepts an EntityName which consists of all   */
/*  those characters found after the colon on one line as distinguished     */
/*  in the Read Data module. This EntityName string is tested for a null    */
/*  string, an oversized string(one that is over 15 characters long), and   */
/*  for embedded blanks within the EntityName. A ConditionCode is passed    */
/*  out indicating which error occurred; or if no error occurred, a         */
/*  ConditionCode is passed out indicating such.                            */
/*                                                                          */
/****************************************************************************/

#include </usrb/cs340/ldb/project/define.h>
#include <ctype.h>
#define NoEntityNameCode 2
#define TooLongEntityNameCode 1
#define BlanksInEntityNameCode 8
#define OK_Code 0

Syntax_Check(EntityName,ConditionCode)
char *EntityName;
int *ConditionCode;
{
char *ch;
int j, i = 0, I = 0, Ch_Count = 0,
    Space = 0, Flag = 0, Error = 0;
char Temp[MAX_STR_LEN];

ch = EntityName;
printf("\nIn Syntaxcheck");
printf("\nBefore isspace and ch is: %s",ch[I]);
while (isspace(ch[I]) != 0)
    I = I + 1;
j = i;
printf("\nAfter isspace and ch is: %s",ch[I]);
while (ch[I] != '\0' && ch[I] != '\n' && ch[I] != '\t' && Space == 0)
{
    if (ch[I] != ' ' && Space == 0)
    {
        Ch_Count = Ch_Count + 1;
        Temp[i] = ch[I];
        i = i + 1;
        I = I + 1;
        Flag = 1;
        printf("\nReading characters");
    }
    else
    {
        printf("\nRead a space after characters");
        if (Flag == 1)
            Space = 1;
        I = I + 1;
    }
}
while (iscntrl(ch[I]) != 0 && ch[I] != '\n'
       && ch[I] != '\t' && ch[I] != '\0' && Error == 0)
{
    if (ch[I] == ' ')
    {
        printf("\nReading spaces after all characters");
        I = I + 1;
    }
    else
    {
        printf("\nError occurred");
        Error = 1;
    }
}
if (Error == 1)
    *ConditionCode = BlanksInEntityNameCode;
else
    if (Ch_Count == 0)
        *ConditionCode = NoEntityNameCode;
    else
        if (Ch_Count > MAX_STR_LEN - 1)
            *ConditionCode = TooLongEntityNameCode;
        else
            *ConditionCode = OK_Code;
printf("\nConditionCode in SC = %d",*ConditionCode);
for (i = 0; i < MAX_STR_LEN - 1; ++i)
{
    EntityName[i] = ch[j];
    j = j + 1;
}
EntityName[MAX_STR_LEN - 1] = '\0';
printf("\nENTITY NAME IS: %s",EntityName);
printf("\nLeaving SyntaxCheck");
}
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APPENDIX  B 
SOURCE  CODE  OF  SAMPLEPROGRAM2 

/*                                                                          */
/*  Procedure   : SyntaxCheck                  Last Revision : 5/05/86      */
/*                                                                          */
/*  Programmer  : Monte L. Hall                                             */
/*                                                                          */
/*  Description : This module accepts an EntityName which consists of all   */
/*  those characters found after the colon on one line as distinguished     */
/*  in the ReadData module. This EntityName string is tested for a null     */
/*  string, an oversized string(one that is over MAX_STR_LEN characters     */
/*  long), and for embedded blanks within the EntityName. A Condition-      */
/*  Code is passed out indicating which error occurred; or if no error      */
/*  occurred, a ConditionCode is passed out indicating such.                */
/*                                                                          */

#include </usrb/cs340/ldb/project/define.h>
#include <ctype.h>
#define NoEntityNameCode 2
#define TooLongEntityNameCode 1
#define BlanksInEntityNameCode 8
#define OK_Code 0

Syntax_Check(EntityName,ConditionCode)
char *EntityName;
int *ConditionCode;
{
char *ch;
int i = 0, I = 0, Ch_Count = 0,        /* initialize indices, counters */
    Space = 0, Flag = 0, Error = 0,    /* and booleans */
    Null = 0;
char Temp[MAX_STR_LEN];

ch = EntityName;
printf("\nIn Syntaxcheck");
printf("\nENTITY NAME is: %s",EntityName);
for (i = 0; i <= MAX_STR_LEN; ++i)
    Temp[i] = ' ';
i = 0;
printf("\nBefore isspace and ch is: %s",ch[I]);
while (isspace(ch[I]) != 0)    /* strip off initial blanks in EntityName */
    I = I + 1;
if (iscntrl(ch[I]) == 0 && (ch[I] != ' '))
{
    printf("\nAfter isspace and ch is: %s",ch[I]);
    while (ch[I] != '\0' && ch[I] != '\n' && ch[I] != '\t' && Space == 0
           && (iscntrl(ch[i]) == 0))
    {    /* continue until eol or space encountered */
        if (ch[I] != ' ' && Space == 0)
        {
            Ch_Count = Ch_Count + 1;
            Temp[i] = ch[I];    /* store EntityName w/o initial blanks */
            i = i + 1;
            I = I + 1;
            Flag = 1;    /* reading characters */
            printf("\nReading characters");
        }    /* end inner if */
        else
        {    /* read a space */
            printf("\nRead a space after characters");
            if (Flag == 1)    /* read a space after characters? */
                Space = 1;    /* turn flag on */
            I = I + 1;
        }    /* end else */
    }    /* end while */
    while (iscntrl(ch[I]) == 0 && ch[I] != '\n'
           && ch[I] != '\t' && ch[I] != '\0' && Error == 0)
    {    /* continue until eol or error occurs */
        if (ch[I] == ' ')    /* read a space after characters */
            I = I + 1;
        else    /* blanks between characters */
            Error = 1;    /* set flag */
    }    /* end while */
}    /* end outer if */
else
    Null = 1;

/* set ConditionCode */
if (Error == 1)
    *ConditionCode = BlanksInEntityNameCode;
else
    if (Ch_Count == 0)
        *ConditionCode = NoEntityNameCode;
    else
        if (Ch_Count > MAX_STR_LEN - 1)
            *ConditionCode = TooLongEntityNameCode;
        else
            *ConditionCode = OK_Code;
printf("\nConditionCode in SC = %d",*ConditionCode);
printf("\nTEMP IS: %s",Temp);
if (Null == 0)
{
    for (i = 0; i < MAX_STR_LEN - 1; ++i)    /* copy entity name w/o initial */
        EntityName[i] = Temp[i];             /* blanks in Temp back into */
                                             /* EntityName */
    EntityName[MAX_STR_LEN - 1] = '\0';
}    /* end if */
else
    EntityName[0] = '\0';
printf("\nENTITY NAME after loop IS: %s",EntityName);
printf("\nLeaving SyntaxCheck");
}    /* end SyntaxCheck module */
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APPENDIX  C 
SOURCE  CODE  OF  COUNT  PROGRAM 


BEGIN  {} 

{count  =  0} 

/  {count  =  1} 

/ {count  =  2} 

/ {count  =  3} 

/ {count  =  4} 

/ {count  =  5} 

/ {count  =  6} 
print  count} 
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APPENDIX  D 
SOURCE CODE OF NESTING PROGRAM

BEGIN { printf "Levels:    \n" > "main.results"
        printf "----------\n" > "main.results"
        one = 0
        two = 0
        three = 0
        four = 0
        five = 0
        six = 0
        sum = 0 }
/0/ {zero = zero + 1}
/1/ {one = one + 1}
/2/ {two = two + 1}
/3/ {three = three + 1}
/4/ {four = four + 1}
/5/ {five = five + 1}
/6/ {six = six + 1}
END { printf "zero    %6d\n", zero > "main.results"
      printf "one    %6d\n", one > "main.results"
      printf "two    %6d\n", two > "main.results"
      printf "three  %6d\n", three > "main.results"
      printf "four    %6d\n", four > "main.results"
      printf "five    %6d\n", five > "main.results"
      printf "six    %6d\n", six > "main.results"
      printf "----------\n" > "main.results"
      zeroave = (zero * 100) / NR
      printf "ZERO % =  %5.3f\n", zeroave > "main.results"
      oneave = (one * 100) / NR
      printf "ONE % =    %5.3f\n", oneave > "main.results"
      twoave = (two * 100) / NR
      printf "TWO % =    %5.3f\n", twoave > "main.results"
      threeave = (three * 100) / NR
      printf "THREE % =  %5.3f\n", threeave > "main.results"
      fourave = (four * 100) / NR
      printf "FOUR % =   %5.3f\n", fourave > "main.results"
      fiveave = (five * 100) / NR
      printf "FIVE % =    %5.3f\n", fiveave > "main.results"
      sixave = (six * 100) / NR
      printf "SIX % =    %5.3f\n", sixave > "main.results"
      average = 100 * (zero + one*2 + two*3 + three*4) / NR
      average += 100 * (four*5 + five*6 + six*7) / NR
      printf "TOTAL AVERAGE =  %5.3f\n", average > "main.results"
      sum = zero + one + two + three + four + five + six
      printf "SUM =         %10d\n", sum > "main.results"
      printf "LINES OF CODE =   %10d\n", NR > "main.results"
      printf "SUM/LINES :  %10.3f\n", (sum/NR) > "main.results"
      printf "----------\n" > "main.results" }
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APPENDIX  E 
SOURCE  CODE  OF  TYPEPGM  PROGRAM 

BEGIN  {  CommentSw  -  0;  StringSw  =  0;  LineNumber  =  0  } 

{ 

# 

#  process  all  the  number  of  fields  in  the  current  record. 
# 

i  =  1 

if(NF  =  =  0) 
{     counl[  "blanklines"]  +  + 
LineNumber  =  NR 

} 

else  {  while  (i  <  =  NF) 

{   if  (CommentSw  =  =  1) 

{  if  (Si  ~  /\*\//)       CommentSw  =  0 


else  { 

if  ((Si  ~  /VUV)  &&  ($i !  ~  /V)  &&  (Si  I  =  "/")) 
{  CommentSw  =  1 
countfcomments"]  +  + 
if  (Si  ~  /\*\//)  CommentSw  =  0 

} 
else  { 

if  (StringSw  =  =  1) 

{if($i~/\7)         StringSw  =  0 

} 
else  { 

if($i~/\7) 
{  StringSw  =  1 

if($i~/\"\)/)  StringSw  =  0 

} 
else  { 


if  ((($1  ~  /\:/)  1 1  ($2  ~  /\:/))  &&  (Si  =  =  $1)) 

{ if  ($1  ~  /default/)       countfdefault"]  +  +   #  ...  default 

else  if  ($1 !  =  "case")    count["labels"]  +  +    #  ...  labels 
} 
if  (Si  "  A(/) 

{  NoOfElement  =  split($i,  Array, "(") 
count["functions"|  = 
count["functions"]  +  NoOfElement  -  1 
for  (k  =  1;  k  NoOfElement;  k  +  + ) 
{   if(Array|kl=  =  "if") 

{count["if]  +  +  count|"functions"]--} 
else  if  (Array( k |  =  =  "for") 
{count["for"]  +  + 
countf'assignments"]— 
count["functions"]- 

} 
else  if  (Array[k]  =  =  "while") 

{count|"while"|  +  + 
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count]"functions"]- 

} 
else  if  (Array[kj  =  =  "switch") 
{countf'switch"]  +  + 
counlffunctions"]-- 

} 
else  if  (Array[k]  =  =  "return") 
{counlfreturn"]  +  + 
count|"functions"]-- 

) 

else  if  ((Array(k]  =  =  "gelchar")  1 1 
(Array[k]  =  -  "getc")) 
{count["input"j  +  + 
count["functions"]-- 

} 
else  if  ((Arrayfk]  =  =  "scanf")  |  | 
(Array[k]  =  =  "fscanf')) 
{count["input"]  +  + 
countf'f  unctions"]-- 

} 
elseif((Array[k]  --  "gets")  || 
(Array[  k]  =  =  "fgets")) 
{count["input"j  +  + 
count["functions"]- 

} 
else  if  ((Array|k|  =  =  "gefw")  1 1 
(Array[k]  =  =  "read")) 
{count["input"]+  + 
count["functions"]- 

} 
else  if  ((Array[k|  =  =  "pulchar")  |  | 
(Arrayfk]  =  =  "putc")) 
{countfoutput"]  +  + 
count("functions"]~ 

} 
else  if  ((Arrayfk]  =  =  "printf)  1 1 
(Array(k]  =  =  "fprintf)) 
(count["output"]  +  + 
count["functions"]- 

} 
else  if  ((Array[k]  =  =  "printw")  1 1 
Array[k]  =  =  "write")) 
(countfoutput"]  +  -r 
count["functions"]— 

} 
else  if  ((Array[k|  =  ■  "puts"|  |  | 
(Array[k]  =  =  "fputs") 
{countfoutput"]  +  + 
count["f  unctions"]— 
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}  #  end'if  ($i  ~  /\(/)' 
if($i~/\  =  /) 

{  if  (($i ! "  /\!  =/)  &&  (Si !  ~  /\  =  =/)  &&  ($i !'  A  =  /)) 
{count["assignmenls"|+  + 

}end'if($i'/\=/)' 
if  ($i  =  =  "int")  countf'declarations"]  +  + 
else  if  ($i  =  =  "float")  countf'declarations"]  +  + 
else  if  ($i  =  =  "double")  count["declaralions"'|  +  + 
else  if  ($i  =  =  "struct)  count|'declarations")  +  + 
else  if  ($i  =  =  "register")  countf'declarations"]  +  + 
else  if  ($i  =  =  "static")  count["declarations"]  +  + 
else  if  ($i  =  =  "char")  countf'declarations"]  +  + 
else  if  ($i  =  =  "if)  {count["if']  +  +  count  ["functions"]--} 
else  if  ($i  =  =  "for") 

{counl["for"|  +  + 

countf'assignments"]-- 

counl["f  unctions"]- 

} 
else  if  ($i  =  =  "while") 

{count["whiIe"]  +  +  countffunctions"]--} 
else  if  ($i  =  =  "switch") 

{counl|"switch"]+  +  count|"funclions"]--} 
else  if  (($i  =  =  "return")  1 1  ($i  =  =  "return;")) 
{count["return"]  +  + 

if  (($i  =  =  "return")  &&  ($(i  + 1)  ~  /\(/) 
{  count["functions"]-  } 

} 
else  if  (($i  =  =  "getchar")  |  |  (Si  =  =  "getc")) 

{count|"input"]+  +  count["functions"]-} 
else  if  (($i  =  =  "scanf")  1 1  ($i  =  =  "fscanf*)) 

{count["input"]  +  +  count["functions"]-} 
else  if  ((Si  =  =  "gets")  1 1  (Si  =  =  "fgets")) 

{countf'input"]  +  +  count["functions"]-} 
else  if  ((Si  =  =  "getw")  1 1  (Si  =  =  "read")) 

{count["input"J  +  +  count["functions"]~} 
else  if  ((Si  =  =  "putchar")  1 1  ($i  =  =  "putc")) 

{countf'output"]  +  +  count["functions"]--} 
else  if  ((Si  =  =  "printf )  1 1  (Si  -  -  "fprintf')) 

{countf'output"] -t-  +  count["functions"]~] 
else  if  ((Si  =  =  "printw")  |  |  (Si  =  =  "write")) 

{count["output"]+  +  count["functions"]--} 
else  if  (($i  =  =  "puts")  1 1  (Si  =  =  "fputs")) 

{count["outpul"|-t-  +  count["functions"]-} 
else  if  (Si  =  =  "else")  count["else"]  +  + 
else  if  (Si  ~  A#/)  count["preprocessor"]  +  + 
else  if  (Si  =  =  "case")  count["case"]  +  + 
else  if  (Si  =  =  "goto")  countf'goto"]  +  + 
else  if  ((Si  =  =  "break")  |  |  (Si  =  =  "break;")) 

count["break"]  +  + 
else  if  ((Si  =  =  "continue")  1 1  (Si  =  =  "continue;") 
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countf'continue"]  +  + 


LineNumber  =  NR 
+  +i 


END {
printf "----------\n" >> "main.results";
printf "FOR          %10d\n", count["for"] >> "main.results";
printf "WHILE        %10d\n", count["while"] >> "main.results";
printf "IF           %10d\n", count["if"] >> "main.results";
printf "ELSE         %10d\n", count["else"] >> "main.results";
printf "SWITCH       %10d\n", count["switch"] >> "main.results";
printf "CASE         %10d\n", count["case"] >> "main.results";
printf "GOTO         %10d\n", count["goto"] >> "main.results";
printf "BREAK        %10d\n", count["break"] >> "main.results";
printf "CONTINUE     %10d\n", count["continue"] >> "main.results";
printf "ASSIGNMENT   %10d\n", count["assignments"] >> "main.results";
printf "PREPROCESSOR %10d\n", count["preprocessor"] >> "main.results";
printf "COMMENT      %10d\n", count["comments"] >> "main.results";
printf "BLANKLINE    %10d\n", count["blanklines"] >> "main.results";
printf "RETURN       %10d\n", count["return"] >> "main.results";
printf "INPUT        %10d\n", count["input"] >> "main.results";
printf "OUTPUT       %10d\n", count["output"] >> "main.results";
printf "FUNCTION     %10d\n", count["functions"] >> "main.results";
printf "DECLARATION  %10d\n", count["declarations"] >> "main.results";
printf "DEFAULT      %10d\n", count["default"] >> "main.results"
#
#  calculate the weights
#
weights  = 18.4 * count["declarations"] + 11.4 * count["if"]
weights +=  7.9 * count["for"]          +  8.5 * count["while"]
weights +=  6.8 * count["switch"]       +  5.6 * count["case"]
weights +=  4.6 * count["preprocessor"] + 11.1 * count["goto"]
weights +=  2.4 * count["comments"]
printf "----------\n" >> "main.results"
printf "WEIGHT/LINES =   %10.5f\n", (weights/NR) >> "main.results";
printf "WEIGHT =         %10.5f\n", weights >> "main.results"
printf "LINES OF CODE =   %10d\n", NR >> "main.results";
printf "----------\n" >> "main.results";
}
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APPENDIX  F 

SOURCE  CODE  OF  CHANGES  PROGRAM 

echo  "%*%  This  is  start  of  data  collection''  >  >  main. results 

date  >  >  main. results 

******  Pretty-Printing  a  Program  ****** 

cb 

<$l>l.cb 

cb 

<$2>  2.cb 

******  Calculating  Occurrence  of  Statement  Types  ****** 

echo "=  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  ="   >>  main. results 

echo  "File  Name  :  "  $1  >  >  main. results 
sed  W "  /g 

s/}/}  /g 

s/{/{/g'$l  I 
awk-fTYPEPGM 

******  Summing  the  Indentation  Level  ****** 
awk  -f  COUNT  l.cb  |  awk  -f  NESTING 
******  Calculating  Occurrence  of  Statement  Types  ****** 
echo "==============================">>  main.results 

echo  "File  Name  :  "  $2  >  >  main.results 
sed  's/7 "  /g 

s/}/}  /g 

s/{/{/g'$2| 
awk-fTYPEPGM 

******  Summing  the  Indentation  Level  ****** 
awk  -f  COUNT  2.cb  |  awk  -f  NESTING 
******  Finding  the  Differences  ****** 
diff -e  l.cb  2.cb  |  grep '  *  [0-9]'  | 
******  Extracting  the  Changed  Statements  ****** 
sed's/V/g 

s/a/  a /g 

s/c/c/g 

s/d/d/g'  | 
awk' 

BEGIN  {printf  "BEGIN  {i  =  0}\n"} 
NF==2{if($2  -- V) 
{printf  "NR  =  =  %d  {print  \"a\",$0  ;i  =  l}\n",$l 
printf  "NR  =  =  %d  {print  \"b\",$0  ;i  =  l}\n",($l  +  1)  } 
else  {printf  "NR=  =  %d  {print  \"%s\",$0  ;i  =  l}\n",$l,$2  }} 
NF=  =3  {for  (j  =$1U<  =$2y  +  + ) 
printf  "NR  =  =  %d  {print  \"%s\",$0  ;i  =  l}\n",j,$3} 
END{}'  result 
awk  -f  result  l.cb  | 
awk'  /~c/  {print $0  >  "temp" }' 
sed  's/c  /  /g 

s/{/  {/g 

s/}/}  /g 

s/"/"/g'  temp  >  final 
******  Calculating  Occurrence  of  Statement  Types  ****** 
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echo  "==============================">>  main. results 

echo  "File  Name  :  "  changes.with.TAB  >  >  main. results 

awk-fTYPEPGM  final 

******  Summing  the  Indentation  Level  ****** 

awk  -f  COUNT  final  |  awk  -f  NESTING 

rm  l.cb  2.cb  result  temp  final 

******  Finding  the  Differences  ****** 

diff-e$l$2  Igrep'^O-O]'  | 

******  Extracting  the  Changed  Statements  ****** 

sed  's/V  /g 

s/a/  a  /g 

s/c/  c  /g 

s/d/  d  /g'  | 
awk ' 

BEGIN  {printf  "BEGIN  {i  =  0}\n"} 
NF==2{if($2  =  =  "a") 
{printf  "NR  =  =  %d  {print  \"a\",$0  ;i  =  l}\n",$l 
printf  "NR==  %d  {print  \"bV,$0  ;i=  l}\n",($l  +  1)  } 
else  {printf  "NR=  =  %d  {print  \"%s\",$0  ;i  =  l}\n",$l,$2  }} 
NF=  =3{for(j  =  $l;j<  =$2j+  +) 
printf  "NR=  =  %d  {print  \"%s\",$0  ;i  =  l}\n"j,$3} 
END{}'  result 
awk -f  result  $1  | 
awk' 

/^c/ {print  $0  >  "temp"  }' 
sed  's/c  /  /g 

s/{/  {/g 

s/}/}  /g 

s/7 "  /g'  temp  >  final 
******  Calculating  Occurrence  of  Statements  Types  ****** 
echo  "=  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =">>  main. results 

echo  "File  Name  :"  changes.without.TAB  >  >  main. results 

awk-fTYPEPGM  final 

rm  result  temp  final 

echo "%%%  This  is  end  of  data  collection"  >  >  main.results 
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APPENDIX  G 


SOURCE  CODE  OF  PICK  PROGRAM 


BEGIN {}
{
if ($1 == "%*%") print $1 > "pick.file"
if (($NF != "1987") && ($1 !~ /\=/) &&
    ($1 !~ /\-/) && ($1 != "Levels") && ($1 !~ /\'
    && (NF != 0) && ($NF != "data"))
    print $NF > "pick.file"
}


APPENDIX  H 
SOURCE  CODE  OF  SEP  PROGRAM 


BEGIN {no = 0; flag = 1}
{
if ($1 == "%*%")
    { ++no
      flag = 1
    }
if (($1 !~ /\with/) && ($1 !~ /\//) && ($1 != "%*%"))
    {
      print $0 > no
      ++flag
    }
if (flag != 1)
    { if (($1 ~ /\with/) || ($1 ~ /\//))
          print "\n" no
    }
}
APPENDIX I
SOURCE CODE OF VERSION 1 OF EXAMPLE

/*                                                                        *
/*  Procedure    :  Recreate Listing               Last Revision :        *
/*                                                                        *
/*  Programmer   :  Mike McClure                                          *
/*                                                                        *
/*  Description  :  This module accepts the data array record and the     *
/*                  counter arrayas inputs. It reads the index value      *
/*                  of each entity name of the data array and recreates   *
/*                  the listing asit was originally read in. This module  *
/*                  then printsthe listing. The module append error is    *
/*                  called by this module.                                *
/*                                                                        *
/**************************************************************************/


#include <stdio.h>
#include </usrb/cs340/ldb/project/structure.h>
#include </usrb/cs340/ldb/project/counter.h>
#define begin {
#define end }
#define inc ++
#define EQ ==
#define NE !=
#define LE <=
#define AND &&
#define NULL '\0'

Recreate_Listing (Data_Record_Array, fp)
struc datarec *Data_Record_Array;

begin /* outer loop of data structure */
int i,j,k; /* looping variables */
FILE *fp;

for (i=0; Data_Record_Array[i].Procedure[0] NE NULL AND
i LE MAX_STRUC_ARR - 1; inc i)
fprintf (fp,"\n PROCEDURE : %s",Data_Record_Array[i].Procedure[j]);
if (Data_Record_Array[i].Boolean1[j] > 0)
Append_Error;

begin /* loping through procedure arrays */
for (j =0; j LE MAX_FLD_ARR-1; inc j)
begin
fprintf (fp,"\n CALLS : %s",Data_Record_Array[i].Calls[j]);
++ Calls;
end

if (Data_Record_Array[i].Boolean2[j] > 0)
Append_Error;

for (j =0; j LE MAX_STRUC_ARR -1; inc j)
fprintf(fp,"\n EXTERNAL INPUT : %s",Data_Record_Array[i].Ext_Input[j]);
if (Data_Record_Array[i].Boolean3[j] > 0)
Append_Error;

for (j =0; j LE MAX_STRUC_ARR-1; inc j)
begin
fprintf (fp,"\n INPUT GLOBAL : %s",Data_Record_Array[i].Input_Global[j]);
++Globals;
end

if (Data_Record_Array[i].Boolean4[j] > 0)
Append_Error;

for (j =0; j LE MAX_STRUC_ARR-1; inc j)
fprintf (fp,"\n INPUT PARAMETER : %s",Data_Record_Array[i].Input_Parameter[j]);
if (Data_Record_Array[i].Boolean5[j] > 0)
Append_Error;

for (j =0; j LE MAX_STRUC_ARR-1; inc j)
fprintf (fp,"\n EXTERNAL OUTPUT : %s",Data_Record_Array[i].Ext_Output[j]);
if (Data_Record_Array[i].Boolean6[j] > 0)
Append_Error;

for (j=0; j LE MAX_STRUC_ARR-1; inc j)
begin
fprintf (fp,"\nOUTPUT_GLOBAL : %s",Data_Record_Array[i].Output_global[j]);
++Globals;
end

if (Data_Record_Array[i].Boolean7[j] > 0)
Append_Error;

for (j =0; j LE MAX_STRUC_ARR-1; inc j)
fprintf(fp,"\nOUTPUT_PARAMETER:%s",Data_Record_Array[i].Output_parameter[j]);
if (Data_Record_Array[i].Boolean8[j] > 0)
Append_Error;

for (j =0; j LE MAX_STRUC_ARR-1; inc j)
fprintf (fp,"\n ILLEGAL : %s",Data_Record_Array[i].Illegal[j]);
if (Data_Record_Array[i].Boolean9[j] > 0)
Append_Error;

for (j =0; j LE MAX_STRUC_ARR-1; inc j)
fprintf (fp,"\n IGNORED : %s",Data_Record_Array[i].Ignored[j]);
end
APPENDIX J
SOURCE CODE OF VERSION 2 OF EXAMPLE


/*                                                                        */
/*  Procedure    :  Recreate Listing      Last Revision : 4-30-86 mm      */
/*                                                                        */
/*  Programmer   :  Mike McClure                                          */
/*                                                                        */
/*  Description  :  This module accepts the data array record and the     */
/*                  counter arrayas inputs. It reads the index value      */
/*                  of each entity name of the data array and recreates   */
/*                  the listing asit was originally read in. This module  */
/*                  then printsthe listing. The module append error is    */
/*                  called by this module.                                */
/*                                                                        */


#include <stdio.h>
#include </usrb/cs340/ldb/project/structure.h>
#include </usrb/cs340/ldb/project/counter.h>
#define begin {
#define end }
#define inc ++
#define EQ ==
#define NE !=
#define LE <=
#define AND &&


Recreate_Listing(Counter_Array,Data_Record_Array)
struct datarec *Data_Record_Array;
rec *Counter_Array;

begin /* outer loop of data structure */

int i,j; /* looping variables */
FILE *fp, *fopen();

fp = fopen("final_report","a");
printf("\n I am in reclist & befor for loop start\n");

for (i = 0; Data_Record_Array[i].Procedure[0] NE '\0' AND
i LE MAX_STRUC_ARR - 1; inc i)
begin /* looping through procedure arrays */
printf("I am in 1st for loop for structure array\n");
fprintf(fp,"\n");
fprintf(fp,"\n");
fprintf (fp,"\nPROCEDURE : %s",Data_Record_Array[i].Procedure);
if (Data_Record_Array[i].Boolean1 > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean1);
fp = fopen("final_report","a");
}

for (j =0; j LE MAX_FLD_ARR-1 AND
Data_Record_Array[i].Calls[j][0] NE '\0'; inc j)
begin
printf("\n I am in field loop array\n");
fprintf(fp,"\n CALLS : %s",Data_Record_Array[i].Calls[j]);
if (Data_Record_Array[i].Boolean2[j] > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean2[j]);
fp = fopen("final_report","a");
}
inc Counter_Array[12].value;
end

for (j =0; j LE MAX_STRUC_ARR -1 AND
Data_Record_Array[i].Ext_input[j][0] NE '\0'; inc j)
begin
fprintf(fp,"\n EXTINPUT : %s",Data_Record_Array[i].Ext_input[j]);
if (Data_Record_Array[i].Boolean3[j] > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean3[j]);
fp = fopen ("final_report","a");
}
end

for (j =0; j LE MAX_STRUC_ARR-1 AND
Data_Record_Array[i].Input_global[j][0] NE '\0'; inc j)
begin
fprintf (fp,"\n INPUT_GLOBAL : %s",Data_Record_Array[i].Input_global[j]);
if (Data_Record_Array[i].Boolean4[j] > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean4[j]);
fp = fopen ("final_report","a");
}
inc Counter_Array[11].value;
end

for (j =0; j LE MAX_STRUC_ARR-1 AND
Data_Record_Array[i].Input_parameter[j][0] NE '\0'; inc j)
begin
fprintf(fp,"\nINPUT_VALUE:%s",Data_Record_Array[i].Input_parameter[j]);
if (Data_Record_Array[i].Boolean5[j] > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean5[j]);
fp = fopen ("final_report","a");
}
end


for (j =0; j LE MAX_STRUC_ARR-1 AND
Data_Record_Array[i].Ext_output[j][0] NE '\0'; inc j)
begin
fprintf(fp,"\nEXTERNAL_OUTPUT:%s",Data_Record_Array[i].Ext_output[j]);
if (Data_Record_Array[i].Boolean6[j] > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean6[j]);
fp = fopen ("final_report","a");
}
end

for (j = 0; j LE MAX_STRUC_ARR-1 AND
Data_Record_Array[i].Output_global[j][0] NE '\0'; inc j)
begin
fprintf(fp,"\nOUTPUT_GLOBAL:%s",Data_Record_Array[i].Output_global[j]);
if (Data_Record_Array[i].Boolean7[j] > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean7[j]);
fp = fopen ("final_report","a");
}
inc Counter_Array[11].value;
end

for (j = 0; j LE MAX_STRUC_ARR-1 AND
Data_Record_Array[i].Output_parameter[j][0] NE '\0'; inc j)
begin
fprintf(fp,"\nOUTPUT_NO:%s",Data_Record_Array[i].Output_parameter[j]);
if (Data_Record_Array[i].Boolean8[j] > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean8[j]);
fp = fopen ("final_report","a"); } end

for (j =0; j LE MAX_STRUC_ARR -1 AND
Data_Record_Array[i].Illegal[j][0] NE '\0'; inc j)
begin
fprintf (fp,"\n Illegal : %s",Data_Record_Array[i].Illegal[j]);
if (Data_Record_Array[i].Boolean9[j] > 0)
{
fclose(fp);
Append_Error(Counter_Array,&Data_Record_Array[i].Boolean9[j]);
fp = fopen ("final_report","a");
}
end

for (j = 0; j LE MAX_STRUC_ARR-1 AND
Data_Record_Array[i].Ignored[j][0] NE '\0'; inc j)
fprintf (fp, "\n Ignored : %s", Data_Record_Array[i].Ignored[j]);
end

fclose(fp);
end
APPENDIX K
SAMPLE  OUTPUT  FROM  PROBIT  FOR  QUANTITATIVE  ANALYSIS  OF  DEBUGGING 


1  SAS(R)    LOG        OS    SAS    5.16  OS/MVT   JOB   VM185600    STEP 

SUBMIT         PROC   SAS  18:56   WEDNESDAY,    APRIL    6,    1988 

NOTE:    COPYRIGHT    (C)    1984,1986    SAS    INSTITUTE    INC.,    CARY,    N.C.       27511, 

U.S.A. 

NOTE:  THE  JOB  VM185600  HAS  BEEN  RUN  UNDER  RELEASE  5.16  OF  SAS  AT  KANSAS 

STATE  UNIVERSITY  (03010001). 

NOTE:  SAS  OPTIONS  SPECIFIED  ARE: 

NOINCLUDE  NOGRAPHICS   SORT=4 

NOTE:   SAS  5.16  has  replaced  SAS  82.3. 

1  OPTIONS   LS=72; 

2  DATA; 

3  INPUT  DOSE  N  RESPONSE; 

4  CARDS; 

NOTE:  DATA  SET  WORK.DATA1  HAS  13  OBSERVATIONS  AND  3  VARIABLES.  680  OBS/T 

RK 

NOTE:  THE  DATA  STATEMENT  USED  0.20  SECONDS  AND  372K. 

18 

19  PROC  PRINT; 

20  VAR  DOSE  N  RESPONSE; 

NOTE:  THE  PROCEDURE  PRINT  USED  0.25  SECONDS  AND  422K 
AND  PRINTED  PAGE  1. 

21  PROC  PROBIT  LOG10; 

22  VAR  DOSE  N  RESPONSE; 

NOTE:  THE  PROCEDURE  PROBIT  USED  0.54  SECONDS  AND  420K 

AND  PRINTED  PAGES  2  TO  6 . 
NOTE:  SAS  USED  422K  MEMORY. 

NOTE:  SAS  INSTITUTE  INC. 
SAS  CIRCLE 
PO  BOX  8000 
CARY,  N.C.  27511-8000 
SAS  1

18:56 WEDNESDAY, APRIL 6, 1988

OBS    DOSE     N    RESPONSE

  1      32     1        1
  2      24     1        1
  3      19     1        1
  4      18     1        1
  5      11     1        0
  6       9     2        1
  7       8     3        2
  8       7     1        1
  9       5     1        0
 10       3     4        2
 11       2     4        1
 12       1     3        0
 13       0    41        0
SAS  2

18:56 WEDNESDAY, APRIL 6, 1988
PROBIT ANALYSIS ON LOG10(DOSE)

ITERATION     INTERCEPT         SLOPE            MU         SIGMA

    0        4.19764739    1.12527653    0.71302706    0.88867045
    1        3.66586399    1.86023043    0.71718858    0.53756781
    2        3.58409386    1.97505921    0.71689301    0.50631393
    3        3.58208644    1.97790611    0.71687607    0.50558517
    4        3.58208523    1.97790783    0.71687606    0.50558473

COVARIANCE MATRIX

               INTERCEPT         SLOPE
INTERCEPT     0.39202706   -0.44235206
SLOPE        -0.44235206    0.64162248

COVARIANCE MATRIX

                      MU         SIGMA
MU            0.02237685    0.00227606
SIGMA         0.00227606    0.04192329

CHI-SQ = 6.1631 WITH 10 DF   PROB > CHI-SQ = 0.8014

NOTE: SINCE THE CHI-SQUARE IS SMALL (P > 0.10), FIDUCIAL
LIMITS WILL BE COMPUTED USING A T VALUE OF 1.96 .
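
The MU and SIGMA columns of the iteration history can be recovered from the INTERCEPT and SLOPE columns. The following is a minimal sketch, assuming the fitted line is expressed on the classical probit scale, probit = 5 + (x - MU)/SIGMA with x = LOG10(DOSE); this convention is inferred from the printed values and the 0 to 10 probit axis, not stated in the output itself. The function name `mu_sigma` is hypothetical.

```python
# Final-iteration estimates copied from the PROC PROBIT iteration history.
INTERCEPT = 3.58208523
SLOPE = 1.97790783

# Assumption: probit = INTERCEPT + SLOPE * x equals 5 + (x - MU) / SIGMA,
# where x = log10(dose). The offset of 5 is inferred, not stated in the output.
PROBIT_OFFSET = 5.0


def mu_sigma(intercept, slope, offset=PROBIT_OFFSET):
    """Return (MU, SIGMA) implied by a fitted probit line."""
    sigma = 1.0 / slope
    mu = (offset - intercept) / slope
    return mu, sigma


mu, sigma = mu_sigma(INTERCEPT, SLOPE)
print(f"MU = {mu:.6f}, SIGMA = {sigma:.6f}")
```

Under this assumption the computed values agree with the printed MU (0.71687606) and SIGMA (0.50558473) to the precision of the inputs.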
SAS  3

18:56 WEDNESDAY, APRIL 6, 1988
PROBIT ANALYSIS ON LOG10(DOSE)

[Plot: PROBIT (vertical axis, 0 to 10) against LOG10(DOSE) (horizontal axis); observed points are marked X. The horizontal axis is labeled at the fitted points listed below.]

LD01      LD05      LD25     LD50     LD75     LD95     LD99
-0.459    -0.115    0.376    0.717    1.058    1.548    1.893
SAS  4

18:56 WEDNESDAY, APRIL 6, 1988
PROBIT ANALYSIS ON LOG10(DOSE)

[Plot: PROBABILITY (vertical axis, 0.0 to 1.0) against LOG10(DOSE) (horizontal axis); observed points are marked X. The horizontal axis is labeled at the fitted points listed below.]

LD01      LD05      LD25     LD50     LD75     LD95     LD99
-0.459    -0.115    0.376    0.717    1.058    1.548    1.893
SAS  5

18:56 WEDNESDAY, APRIL 6, 1988
PROBIT ANALYSIS ON LOG10(DOSE)

                                 95 PERCENT FIDUCIAL LIMITS
PROBABILITY    LOG10(DOSE)         LOWER          UPPER

   0.01        -0.45928991     -4.92685122     0.09547234
   0.02        -0.32146803     -4.26477822     0.17849658
   0.03        -0.23402447     -3.84563565     0.23209431
   0.04        -0.16824409     -3.53095349     0.27303627
   0.05        -0.11473682     -3.27547273     0.30682829
   0.06        -0.06919373     -3.05843356     0.33600569
   0.07        -0.02926135     -2.86850248     0.36195838
   0.08         0.00649333     -2.69878239     0.38553646
   0.09         0.03901078     -2.54475000     0.40730102
   0.10         0.06894315     -2.40327159     0.42764398
   0.15         0.19287116     -1.82167412     0.51603026
   0.20         0.29136521     -1.36666280     0.59350078
   0.25         0.37586434     -0.98505006     0.66871004
   0.30         0.45174716     -0.65387609     0.74777635
   0.35         0.52206391     -0.36300645     0.83705548
   0.40         0.58878763     -0.10959796     0.94437098
   0.45         0.65334360      0.10498151     1.07879615
   0.50         0.71687606      0.27931762     1.24793134
   0.55         0.78040852      0.41652275     1.45419751
   0.60         0.84496448      0.52466424     1.69506067
   0.65         0.91168820      0.61309224     1.96735667
   0.70         0.98200495      0.68962129     2.27097637
   0.75         1.05788778      0.76016205     2.61067589
   0.80         1.14238690      0.82955995     2.99810000
   0.85         1.24088095      0.90293269     3.45720910
   0.90         1.36480896      0.98829210     4.04183343
   0.91         1.39474133      1.00811672     4.18383019
   0.92         1.42725878      1.02938413     4.33835972
   0.93         1.46301347      1.05248274     4.50855928
   0.94         1.50294585      1.07796964     4.69895615
   0.95         1.54848894      1.10669008     4.91645229
   0.96         1.60199620      1.14002741     5.17238774
   0.97         1.66777659      1.18050666     5.48753260
   0.98         1.75522015      1.23361420     5.90716537
   0.99         1.89304202      1.31606856     6.56980824
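
The LOG10(DOSE) column of this table is consistent with normal quantiles scaled by the fitted MU and SIGMA. The sketch below assumes the relation log10(dose) = MU + SIGMA * z_p, where z_p is the standard normal quantile at probability p, and that no adjustment for a natural response rate is applied; both are assumptions checked only against the printed values. The function name `log10_dose_at` is hypothetical.

```python
from statistics import NormalDist

# Final estimates copied from the PROC PROBIT iteration history.
MU = 0.71687606
SIGMA = 0.50558473


def log10_dose_at(p, mu=MU, sigma=SIGMA):
    """log10(dose) at which the fitted response probability equals p."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return mu + sigma * z


for p in (0.01, 0.50, 0.99):
    print(f"p = {p:.2f}  log10(dose) = {log10_dose_at(p):.4f}")
```

The fiducial limits in the two right-hand columns depend on the covariance matrix of the estimates and are not reproduced by this sketch.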
SAS  6

18:56 WEDNESDAY, APRIL 6, 1988
PROBIT ANALYSIS ON LOG10(DOSE)

                                 95 PERCENT FIDUCIAL LIMITS
PROBABILITY       DOSE             LOWER              UPPER

   0.01         0.34730425      0.00001183         1.24586889
   0.02         0.47701492      0.00005435         1.50833072
   0.03         0.58341223      0.00014268         1.70645292
   0.04         0.67882200      0.00029447         1.87515109
   0.05         0.76782664      0.00053031         2.02688118
   0.06         0.85271964      0.00087411         2.16773250
   0.07         0.93484293      0.00135362         2.30122125
   0.08         1.01506378      0.00200086         2.42960940
   0.09         1.09398353      0.00285266         2.55447124
   0.10         1.17204194      0.00395119         2.67697296
   0.15         1.55908991      0.01507738         3.28118154
   0.20         1.95598361      0.04298701         3.92193848
   0.25         2.37609794      0.10350229         4.66347916
   0.30         2.82974411      0.22188294         5.59469418
   0.35         3.32708512      0.43350444         6.87156210
   0.40         3.87960607      0.77696604         8.79773713
   0.45         4.50135846      1.27344885        11.98936417
   0.50         5.21045989      1.90246913        17.69829150
   0.55         6.03126645      2.60929243        28.45755029
   0.60         6.99784766      3.34706570        49.55194094
   0.65         8.15996323      4.10291234        92.75912967
   0.70         9.59411565      4.89351910       186.62781588
   0.75        11.42583048      5.75654690       408.01477870
   0.80        13.87991806      6.75398276       995.63464374
   0.85        17.41329485      7.99710297      2865.55731738
   0.90        23.16375491      9.73401698     11011.16909866
   0.91        24.81654571     10.18865176     15269.68899306
   0.92        26.74599656     10.70000878     21795.14307647
   0.93        29.04112710     11.28451091     32252.19550293
   0.94        31.83800505     11.96656871     49998.40542578
   0.95        35.35810134     12.78468643     82499.68400879
   0.96        39.99412541     13.80471374    148726.29000454
   0.97        46.53466462     15.15328058    307278.80195588
   0.98        56.91413614     17.12435399    807542.47142539
   0.99        78.17034381     20.70468193   3713712.13215244
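
Each entry on the DOSE scale is the base-10 antilogarithm of the corresponding entry on the LOG10(DOSE) scale, for the point estimate and for both fiducial limits. A short sketch of that back-transformation follows, using the PROBABILITY = 0.50 row as input; the helper name `to_dose` is hypothetical.

```python
def to_dose(log10_dose):
    """Convert a value on the LOG10(DOSE) scale back to the DOSE scale."""
    return 10.0 ** log10_dose


# PROBABILITY = 0.50 row on the log scale: estimate, lower limit, upper limit.
log_estimate, log_lower, log_upper = 0.71687606, 0.27931762, 1.24793134

estimate = to_dose(log_estimate)
lower = to_dose(log_lower)
upper = to_dose(log_upper)
print(f"LD50 = {estimate:.4f}  (95% fiducial limits {lower:.4f} to {upper:.4f})")
```

Because the transformation is nonlinear, limits that are roughly symmetric about the estimate on the log scale become strongly asymmetric on the dose scale, which is why the upper limits grow so large at high probabilities.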
ABSTRACT

The  historical  data  of  a  program  collected  during  its 
development  phase  contain  important  information  regarding 
the  activities  of  the  software  development.  This  work 
proposes,  both  qualitatively  and  quantitatively,  a  program 
change  pattern  classification  and  a  set  of  intuitive  rules 
for  effective  evaluation  of  the  changes  during  software 
development.  It  is  important  that  a  software  manager  sees 
and  interprets  the  pattern  changes  during  software 
development.  The  intuitive  rules  are  designed  to 
facilitate  the  analysis  of  those  changes;  the  results  can 
be  used  to  aid  the  software  manager  in  evaluating  the 
software  development. 


